A communication system is studied in which two users communicate 
with one receiver over a common discrete memoryless channel. The infor- 
mation to be transmitted by the users may be correlated. Their information 
rates are described by a point in a suitably defined three-dimensional 
rate space. 

A point in this rate space is called admissible if there exist coders and 
decoders for the channel that permit the users to transmit information over 
it at the corresponding rates with arbitrarily small error probability. The 
closure of the set of all admissible rate points is called the capacity region, 
C, and is the natural generalization of channel capacity to this situation. 

In this paper we show that C, which depends only on the channel, is 
convex and we give formulas to determine it exactly. Several simple 
channels are treated in detail and their capacity regions given explicitly. 

I. INTRODUCTION 

The mathematical theory of communication has been concerned, for
the most part, with the reliable transmission of information from a
single information source to a single user. An extensive literature exists
on this problem: the basic concepts are contained in the classic papers
of Shannon [1].

* J. K. Wolf is Professor of Electrical Engineering at the University of
Massachusetts, Amherst, Mass. Partial support for his research on this paper
was furnished by the Air Force Office of Scientific Research under contract
F-44620-72-C-0085.

In this work we consider the case in which messages from a set of 
information sources are communicated over a common channel to a 
single receiver. We impose constraints on the encoding techniques 
which can be employed. 

A precise formulation of the problem is presented in a subsequent 
section. Here we describe in less mathematical terms the type of 
problem considered. 

A particular multiple access communication channel with two inputs
and one output is shown in Fig. 1. Here the two inputs, $X_1$ and $X_2$, and
the output $Y$ each take values from the set $\{0, 1\}$. The conditional
probability of the output $Y$ for each of the four possible input pairs
$(X_1, X_2)$ is also shown in the schema at the right in Fig. 1.

It is clear that if the transmitters can cooperate with each other
they can transmit without error one bit per channel use by transmitting
either the pair $(X_1 = 0, X_2 = 0)$ or the pair $(X_1 = 1, X_2 = 1)$. Such
would be the case if a common binary source were connected to both 
inputs without any coding. If a message is to be transmitted by con- 
necting it to only one input, say to input 1, and if the other input is 
unaware of the message, then even if no information is to be transmitted 
through input 2, the information rate for input 1 must be substan- 
tially less than one bit per channel use in order to achieve reliable 
communication. If two independent messages are to be connected 
separately to the inputs — message 1 to input 1, message 2 to input 2 — 
the situation is even more difficult. 

A general configuration that we consider is shown in Fig. 2. Three
sources emitting statistically independent messages at rates $R_0$, $R_1$, and
$R_2$ are connected to a multiple access channel via two encoders. The
messages from source 1 and source 0 are inputs to encoder 1 and its
output is connected to one of the input terminals of the channel.



Fig. 1. A multiple access channel.

Fig. 2. Multiple access channel with correlated sources.

Encoder 2 has as inputs the messages from both source 2 and source 0
and its output is connected to the other input terminal of the channel.
The channel output is connected to a decoder which estimates the three 
source messages. It is convenient to represent the rates of the three 
message sources by a point in a three-dimensional rate space. 

For each given channel of the sort just described, there are certain 
rate triplets, $(R_0, R_1, R_2)$, for which it is possible to attain arbitrarily
small probability of error in the system output by using sufficiently 
clever encoding and decoding schemes. For other points in the rate space 
this is not possible. We call the closure of the set of rate points for 
which the error probability can be made arbitrarily small the admissible 
rate region or the capacity region for this channel. It is a natural general- 
ization to the multiple access channel of the channel capacity that is 
associated with the more commonly studied channel having a single 
input and a single output. 

The main result of this paper is a complete determination of the
capacity region $\mathcal{C}$. A typical case is shown in Fig. 3. The region always
lies in the first octant and is bounded by the planes $R_0 = 0$, $R_1 = 0$,
$R_2 = 0$ and a convex surface. The equations that describe $\mathcal{C}$ will be
shown to involve various conditional and unconditional mutual informations.
This is analogous to the single-user channel where the capacity
is calculated from a mutual information.

Fig. 3. An admissible rate region.

Problems resembling ours have been treated by several authors. 
Shannon [2] and then Van der Meulen [3] consider a two-way channel with
two inputs and two outputs. The configuration of the encoders and 
decoders is different than in our model, so that the problems are not the 
same. One similarity, however, is that the two sources are described by 
a pair of rates which are represented by a point in rate space. For 
certain points, encoders and decoders exist for which the probability of 
error can be made as small as desired. 

The multiple access channel has been investigated by Liao [4], Van der
Meulen [5], and Ahlswede [6]. Liao [4] and Ahlswede [6] both prove a coding
theorem and a converse for the case of independent sources. Our
results reduce to theirs for the case $R_0 = 0$. Correlation in the sources
adds a totally new dimension to the problem (and literally to the region
of admissible rates).

A problem which is the dual of the one considered here is the broadcast
channel investigated by Cover [7] and Bergmans [8]. There, a channel
with one input and two outputs is considered along with a single 
encoder and two decoders. Again the concept of an admissible rate 
region applies. 

A brief outline of the paper follows. In Section II a detailed problem 
formulation is presented. Section III summarizes the main results of 
the paper and gives some examples. Sections IV and V and the as- 
sociated appendixes give details of the derivation of a coding theorem 
and a converse. A more useful description of the admissible rate region 
is given in Section VI. We conclude in Section VII with some generaliza- 
tions and comments. 

II. PROBLEM FORMULATION 

Consider the block diagram shown in Fig. 4. The three sources are
described by a three-dimensional rate vector $\mathbf{R} = (R_0, R_1, R_2)$ with
non-negative components. For a fixed positive integer $N$, we define the
components of the vector $\mathbf{M}$ by

$$\mathbf{M} = \mathbf{M}(\mathbf{R}, N) = (M_0, M_1, M_2), \tag{1a}$$

$$M_i = \lceil e^{N R_i} \rceil, \quad i = 0, 1, 2, \tag{1b}$$

where $\lceil x \rceil$ is the smallest integer greater than or equal to $x$. Every $N$
time units the sources produce a triplet of numbers $(i, j, k)$ that are the
corresponding values of the random variables $(U_0, U_1, U_2)$. These
random variables are assumed to be statistically independent and
uniformly distributed over the rectangular lattice of dimensions
$M_0 \times M_1 \times M_2$. That is, their joint probability distribution is

$$P_{U_0 U_1 U_2}(i, j, k) = \Pr[U_0 = i,\ U_1 = j,\ U_2 = k] = \frac{1}{M_0 M_1 M_2}, \tag{2}$$

$$i \in \{1, 2, \ldots, M_0\} \equiv I_0, \quad j \in \{1, 2, \ldots, M_1\} \equiv I_1, \quad k \in \{1, 2, \ldots, M_2\} \equiv I_2.$$

Fig. 4. Notation for multiple access channel.
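The bookkeeping in (1) and (2) can be illustrated with a minimal Python sketch. The function names `message_set_sizes` and `triplet_probability` are hypothetical and chosen only for this illustration; rates are taken in natural units (nats), in keeping with the exponential in (1b).

```python
import math

def message_set_sizes(rates, block_length):
    """Return (M_0, M_1, M_2) with M_i = ceil(exp(N * R_i)), as in (1b)."""
    return tuple(math.ceil(math.exp(block_length * r)) for r in rates)

def triplet_probability(sizes):
    """Probability of any one triplet (i, j, k) under the uniform law (2)."""
    m0, m1, m2 = sizes
    return 1.0 / (m0 * m1 * m2)

# Illustrative values only: R = (0.5, 0.25, 0) nats per channel use, N = 4.
sizes = message_set_sizes((0.5, 0.25, 0.0), block_length=4)
prob = triplet_probability(sizes)
```

Because of the ceiling in (1b), the number of messages per block is at least, and in general slightly more than, $e^{N R_i}$.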

The channel is a probabilistic mapping which every unit of time maps
a pair of real numbers $(x_1, x_2)$ to the real number $y$. The real numbers
$x_1$, $x_2$, and $y$ belong to the finite alphabets $\mathcal{X}_1$, $\mathcal{X}_2$, and $\mathcal{Y}$, respectively.
The mapping is governed by the conditional probability distribution
$P_{Y|X_1X_2}(y|x_1, x_2)$ for all $x_1$ in $\mathcal{X}_1$, $x_2$ in $\mathcal{X}_2$, and $y$ in $\mathcal{Y}$. Here we describe
the inputs by the pair of random variables $(X_1, X_2)$ and the output by
the random variable $Y$. Throughout this paper, it will be assumed that
$P_{Y|X_1X_2}$ is specified a priori and cannot be altered.

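To describe how the channel processes sequences of $N$ input pairs,
we define the $N$-vectors

$$\mathbf{x}_1 = (x_{11}, \ldots, x_{1N}) \in (\mathcal{X}_1)^N, \quad \mathbf{x}_2 = (x_{21}, \ldots, x_{2N}) \in (\mathcal{X}_2)^N, \quad \mathbf{y} = (y_1, \ldots, y_N) \in (\mathcal{Y})^N, \tag{3}$$

and in a similar way the corresponding random vectors $\mathbf{X}_1$, $\mathbf{X}_2$, and $\mathbf{Y}$.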
Here $(\mathcal{X}_1)^N$ is the set of all $N$-vectors whose components are in $\mathcal{X}_1$. The
sets $(\mathcal{X}_2)^N$ and $(\mathcal{Y})^N$ are defined analogously. We assume the channel
is stationary and memoryless; that is,

$$P^N_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2) = \prod_{t=1}^{N} P_{Y|X_1X_2}(y_t \mid x_{1t}, x_{2t}). \tag{4}$$

The superscript $N$ on the joint probability distribution indicates the
dimension of the vectors.
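As an illustration of the product form (4), the following minimal Python sketch evaluates the block transition probability from single-letter probabilities. The data layout (`channel[(a, b)][c]` holding the probability of output `c` given inputs `a`, `b`) and the example channel `noisy_and` are assumptions made only for this sketch; they are not taken from the paper.

```python
def block_likelihood(channel, x1, x2, y):
    """P^N(y | x1, x2) as the product over t of P(y_t | x1_t, x2_t), eq. (4).

    channel[(a, b)][c] is the single-letter probability of output c
    given the input pair (a, b); missing outputs have probability 0.
    """
    prob = 1.0
    for a, b, c in zip(x1, x2, y):
        prob *= channel[(a, b)].get(c, 0.0)
    return prob

# Hypothetical binary channel for illustration: the output equals
# X1 AND X2 with probability 0.9 and is inverted with probability 0.1.
noisy_and = {
    (a, b): {a & b: 0.9, 1 - (a & b): 0.1}
    for a in (0, 1) for b in (0, 1)
}

likelihood = block_likelihood(noisy_and, (1, 1, 0), (1, 0, 0), (1, 0, 0))
```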

The encoders are deterministic mappings from the source outputs to
channel input vectors. Encoder 1 is a mapping from the source pair
$(i, j)$ to an $N$-vector $\mathbf{x}_1 \in (\mathcal{X}_1)^N$. The functional form for this mapping
is written

$$\mathbf{x}_1 = f_N(i, j), \quad i \in I_0, \quad j \in I_1, \quad \mathbf{x}_1 \in (\mathcal{X}_1)^N. \tag{5}$$

Similarly, encoder 2 is a mapping from the source pair $(i, k)$ to the
$N$-vector $\mathbf{x}_2 \in (\mathcal{X}_2)^N$. The functional form for this mapping is written

$$\mathbf{x}_2 = g_N(i, k), \quad i \in I_0, \quad k \in I_2, \quad \mathbf{x}_2 \in (\mathcal{X}_2)^N. \tag{6}$$

The collection of $(M_0 M_1 + M_0 M_2)$ $N$-vectors which result from
these mappings is called a code of block length $N$. Usually, we will adopt
the more suggestive notation $\mathbf{x}_{1ij}$ and $\mathbf{x}_{2ik}$ instead of $f_N(i, j)$ and $g_N(i, k)$.
To summarize the operation of the sources, encoders, and channel we
note that:

(i) Every $N$ time units, the three sources produce a triplet $(i, j, k)$.
(ii) The two encoders act upon the source outputs to produce the
two $N$-vectors $\mathbf{x}_{1ij}$ and $\mathbf{x}_{2ik}$.
(iii) The components of these vectors are impressed upon the
channel, one pair of inputs each time unit. Corresponding to
each pair of inputs the channel produces an output, so that in
the $N$ time units the channel produces an output $N$-vector, $\mathbf{y}$.

The decoder is a deterministic mapping from the vector $\mathbf{y}$ to the
triplet $(i^*, j^*, k^*)$ where $i^* \in I_0$, $j^* \in I_1$, $k^* \in I_2$. We describe this
mapping by $(i^*, j^*, k^*) = h_N(\mathbf{y})$. The triplet of decoder outputs is
denoted by the vector random variable $(U_0^*, U_1^*, U_2^*)$.

The deterministic mappings $(f_N, g_N, h_N)$ will be called a coding. A
coding with rate vector $\mathbf{R} = (R_0, R_1, R_2)$ and block length $N$ will be
denoted by $C_N(\mathbf{R})$. For a given coding, we can in principle calculate
the probability of the error event $\mathcal{E}$, where $\mathcal{E}^c$ (the complement of $\mathcal{E}$)
is defined as

$$\mathcal{E}^c = \text{event}\,\{U_0^* = U_0 \text{ and } U_1^* = U_1 \text{ and } U_2^* = U_2\}. \tag{7}$$

For the coding $C_N(\mathbf{R})$, we denote the probability of the error event
(hereafter called the probability of error) by $P_e(C_N(\mathbf{R}))$.

A rate vector $\mathbf{R}$ will be said to be admissible if, for every $\epsilon > 0$, there
exists a positive integer $N$ and a coding $C_N(\mathbf{R})$ such that $P_e(C_N(\mathbf{R})) \le \epsilon$.
The closure of the set of admissible rate vectors is called the admissible
region or capacity region, and is denoted by $\mathcal{C}$. Our purpose is to specify
$\mathcal{C}$ for an arbitrary, discrete, memoryless, multiple access channel.

III. SUMMARY OF RESULTS AND EXAMPLES 

The main results of this paper are two alternative descriptions of the 
admissible rate region $\mathcal{C}$ for any discrete memoryless channel. The 
proofs that these yield the correct region are contained in the remaining 
sections of the paper. Here we discuss only the simplest of the results. 

We shall have much need of conditional mutual information expres- 
sions in the sequel. We remind the reader of the definition 



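$$I(\mathbf{A}; \mathbf{B} \mid \mathbf{C}) = \sum_{\mathbf{i}} \sum_{\mathbf{j}} \sum_{\mathbf{k}} P_{\mathbf{ABC}}(\mathbf{i}, \mathbf{j}, \mathbf{k}) \log \frac{P_{\mathbf{AB}|\mathbf{C}}(\mathbf{i}, \mathbf{j} \mid \mathbf{k})}{P_{\mathbf{A}|\mathbf{C}}(\mathbf{i} \mid \mathbf{k})\, P_{\mathbf{B}|\mathbf{C}}(\mathbf{j} \mid \mathbf{k})}. \tag{8}$$

Here

$$P_{\mathbf{ABC}}(\mathbf{i}, \mathbf{j}, \mathbf{k}) = \Pr[A_\alpha = i_\alpha,\ B_\beta = j_\beta,\ C_\gamma = k_\gamma,\ \alpha = 1, 2, \ldots, L;\ \beta = 1, 2, \ldots, M;\ \gamma = 1, 2, \ldots, N]$$

is the joint distribution function of the discrete random variables
$A_1, A_2, \ldots, A_L$, $B_1, B_2, \ldots, B_M$, $C_1, C_2, \ldots, C_N$. The conditional
distributions $P_{\mathbf{AB}|\mathbf{C}}(\mathbf{i}, \mathbf{j} \mid \mathbf{k})$, etc., are defined in the usual way.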
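Definition (8) translates directly into a short computation. The sketch below is a minimal Python illustration, assuming the joint distribution is supplied as a dictionary keyed by outcome triples; the function name and this data layout are choices made here, not part of the paper. Vector-valued variables can be handled by using tuples as the components of each key. Logarithms are natural, so the result is in nats.

```python
import math
from collections import defaultdict

def conditional_mutual_information(joint):
    """I(A; B | C) in nats from joint[(a, b, c)] = Pr[A=a, B=b, C=c], eq. (8)."""
    p_c = defaultdict(float)
    p_ac = defaultdict(float)
    p_bc = defaultdict(float)
    for (a, b, c), p in joint.items():
        p_c[c] += p
        p_ac[(a, c)] += p
        p_bc[(b, c)] += p
    total = 0.0
    for (a, b, c), p in joint.items():
        if p > 0.0:
            # P(a,b|c) / (P(a|c) P(b|c)) = P(a,b,c) P(c) / (P(a,c) P(b,c))
            total += p * math.log(p * p_c[c] / (p_ac[(a, c)] * p_bc[(b, c)]))
    return total

# Example: A and B are independent fair bits and C = A XOR B.
# Given C, knowing B determines A, so I(A; B | C) = log 2.
xor_joint = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
cmi = conditional_mutual_information(xor_joint)
```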

Let us return now to consider a discrete memoryless channel with
input alphabets $\mathcal{X}_1$ and $\mathcal{X}_2$, output alphabet $\mathcal{Y}$, and transition
probabilities $P_{Y|X_1X_2}(y|x_1, x_2)$, $x_1 \in \mathcal{X}_1$, $x_2 \in \mathcal{X}_2$, $y \in \mathcal{Y}$. Let $Z$ be a random
variable which takes on values in the set

$$\mathcal{Z} = \{1, 2, \ldots, M\}. \tag{9}$$

From any set of three distributions $P_{X_1|Z}(x_1|z)$, $P_{X_2|Z}(x_2|z)$, and $P_Z(z)$,
$x_1 \in \mathcal{X}_1$, $x_2 \in \mathcal{X}_2$, $z \in \mathcal{Z}$, form the joint distribution

$$P_{ZX_1X_2Y}(z, x_1, x_2, y) = P_Z(z)\, P_{X_1|Z}(x_1|z)\, P_{X_2|Z}(x_2|z)\, P_{Y|X_1X_2}(y|x_1, x_2). \tag{10}$$

Now denote by $\mathcal{R}(P_{ZX_1X_2Y})$ the set of vectors $\mathbf{R} = (R_0, R_1, R_2)$ such that

$$0 \le R_1 \le I(X_1; Y \mid X_2, Z), \tag{11a}$$

$$0 \le R_2 \le I(X_2; Y \mid X_1, Z), \tag{11b}$$

$$0 \le R_1 + R_2 \le I(X_1, X_2; Y \mid Z), \tag{11c}$$

$$0 \le R_0 + R_1 + R_2 \le I(X_1, X_2; Y), \tag{11d}$$

where the mutual informations are computed according to (8) using the
joint distribution (10). This region is a polyhedron such as is shown in
Fig. 7, Appendix I. Then the admissible rate region $\mathcal{C}$ is given by

$$\mathcal{C} = \text{closure of the convex hull of } \bigcup \mathcal{R}(P_{ZX_1X_2Y}), \tag{12}$$

where the union is taken over all possible choices of $P_{X_1|Z}$, $P_{X_2|Z}$, and
$P_Z$, and all values of $M$, the size of the $\mathcal{Z}$ alphabet.†
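For one fixed choice of the three distributions, the right-hand sides of (11a) to (11d) can be evaluated numerically. The following Python sketch is only an illustration under stated assumptions: the function names, the dictionary-based data layout, and the indexing of $z$ from 0 are choices made here. Each mutual information is written as a combination of joint entropies of the distribution (10), which is equivalent to applying definition (8).

```python
import math
from collections import defaultdict

def marginal_entropy(joint, keep):
    """Entropy (nats) of the marginal of `joint` on the coordinates in `keep`."""
    marg = defaultdict(float)
    for outcome, prob in joint.items():
        marg[tuple(outcome[c] for c in keep)] += prob
    return -sum(q * math.log(q) for q in marg.values() if q > 0.0)

def rate_bounds(p_z, p_x1_given_z, p_x2_given_z, channel):
    """Right-hand sides of (11a)-(11d) for the joint distribution (10).

    p_z[z] is P_Z(z); p_x1_given_z[z][x1] and p_x2_given_z[z][x2] are the
    conditional input laws; channel[(x1, x2)][y] is P(y | x1, x2).
    """
    joint = defaultdict(float)  # outcomes are ordered (z, x1, x2, y)
    for z, pz in enumerate(p_z):
        for x1, a in p_x1_given_z[z].items():
            for x2, b in p_x2_given_z[z].items():
                for y, c in channel[(x1, x2)].items():
                    joint[(z, x1, x2, y)] += pz * a * b * c

    def H(*keep):
        return marginal_entropy(joint, keep)

    Z, X1, X2, Y = 0, 1, 2, 3
    h_all = H(Z, X1, X2, Y)
    return {
        "R1": H(Z, X1, X2) + H(Z, X2, Y) - H(Z, X2) - h_all,
        "R2": H(Z, X1, X2) + H(Z, X1, Y) - H(Z, X1) - h_all,
        "R1+R2": H(Z, X1, X2) + H(Z, Y) - H(Z) - h_all,
        "R0+R1+R2": H(X1, X2) + H(Y) - H(X1, X2, Y),
    }

# Illustration: deterministic channel with Y = X1 * X2.
product_channel = {(a, b): {a * b: 1.0} for a in (0, 1) for b in (0, 1)}
fair = {0: 0.5, 1: 0.5}
independent_inputs = rate_bounds([1.0], [fair], [fair], product_channel)
# With M = 2, Z can force both inputs to agree (full cooperation).
cooperative_inputs = rate_bounds(
    [0.5, 0.5], [{0: 1.0}, {1: 1.0}], [{0: 1.0}, {1: 1.0}], product_channel
)
```

In the cooperative case the conditional bounds (11a) to (11c) collapse to zero while the bound (11d) on the sum rate reaches $\log 2$, so only the common source can carry information.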

To obtain the intersection of the admissible rate region $\mathcal{C}$ with the
plane $R_0 = 0$, the size of the alphabet $\mathcal{Z}$ can be set equal to 1. The
random variable $Z$ no longer appears in the equations. For $R_0 = 0$, we
then define $\mathcal{R}(P_{X_1}, P_{X_2})$ as the set of vectors $\mathbf{R} = (0, R_1, R_2)$ such that

$$0 \le R_1 \le I(X_1; Y \mid X_2), \tag{13a}$$

$$0 \le R_2 \le I(X_2; Y \mid X_1), \tag{13b}$$

$$0 \le R_1 + R_2 \le I(X_1, X_2; Y). \tag{13c}$$

Then

$$\mathcal{C}\,\big|_{R_0 = 0} = \text{closure of the convex hull of } \bigcup \mathcal{R}(P_{X_1}, P_{X_2}), \tag{14}$$

where the union is taken over all possible choices for the unconditional
distributions $P_{X_1}$ and $P_{X_2}$. This is the solution found by Liao [4] for
uncorrelated sources.

Other equations for specifying the region $\mathcal{C}$ are given in Section VI.
They involve the calculation of mutual informations among long
sequences of random variables and thus do not appear to be useful for
computation.

Quite generally, $\mathcal{C}$ is convex. It is always bounded by portions of the
three coordinate planes and a surface which encloses a finite volume
in the first octant of rate space. If $\mathbf{R} = (R_0, R_1, R_2)$ is in $\mathcal{C}$, then for
any $\mathbf{S} = (S_0, S_1, S_2)$ satisfying $0 \le S_i \le R_i$, $i = 0, 1, 2$, the rate vector
$\mathbf{S}$ is also in $\mathcal{C}$.

In the remainder of this section, some simple examples are presented 
for which the admissible rate region has an explicit characterization. 



† We suspect that it suffices to consider only values of $M \le \lceil e^{R_0} \rceil$, but have not
been able to prove this conjecture.

Example 1 (Multiplier Channel)

Both the inputs, $X_1$ and $X_2$, and the output, $Y$, for this channel
take values 0 and 1. The output is the product of the two inputs. Formally,
$\mathcal{X}_1 = \mathcal{X}_2 = \mathcal{Y} = \{0, 1\}$ and $P_{Y|X_1X_2}(0|0, 0) = P_{Y|X_1X_2}(0|0, 1)
= P_{Y|X_1X_2}(0|1, 0) = P_{Y|X_1X_2}(1|1, 1) = 1$, and all other conditional
probabilities are zero. Note that the channel is deterministic.

The pyramid described by the planes

$$R_0 = 0, \quad R_1 = 0, \quad R_2 = 0, \quad R_0 + R_1 + R_2 = \log 2 \tag{15}$$

must contain the admissible rate region $\mathcal{C}$, as is seen from (11d) since
$0 \le R_0 + R_1 + R_2 \le I(X_1, X_2; Y) \le \log 2$. But the rate vectors
$\mathbf{R}_1 = (\log 2, 0, 0)$, $\mathbf{R}_2 = (0, \log 2, 0)$, and $\mathbf{R}_3 = (0, 0, \log 2)$ are all
admissible by the following strategies:

$\mathbf{R}_1$: Choose $N = 1$, $M_0 = 2$, $M_1 = M_2 = 1$ and use code words
$x_{111} = 0$, $x_{121} = 1$, $x_{211} = 0$, $x_{221} = 1$.
$\mathbf{R}_2$: Choose $N = 1$, $M_0 = 1$, $M_1 = 2$, $M_2 = 1$ and use code words
$x_{111} = 0$, $x_{112} = 1$, $x_{211} = 1$.
$\mathbf{R}_3$: Choose $N = 1$, $M_0 = 1$, $M_1 = 1$, $M_2 = 2$ and use code words
$x_{111} = 1$, $x_{211} = 0$, $x_{212} = 1$.

The probability of error for these codes is zero. Since the convex hull
of these three rate points and the origin is the set bounded by the
planes (15), the capacity region $\mathcal{C}$ is as shown in Fig. 5.
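The zero-error claim for the three block-length-one codes can be checked mechanically. The Python sketch below is an illustration only; the dictionary layout (`enc1[(i, j)]` for $x_{1ij}$, `enc2[(i, k)]` for $x_{2ik}$) and the function names are assumptions made here. Because the channel is deterministic, a code has zero error probability exactly when distinct source triplets produce distinct outputs.

```python
def multiplier_channel(x1, x2):
    """Deterministic channel of Example 1: the output is the product of the inputs."""
    return x1 * x2

# Code words for N = 1: enc1[(i, j)] = x_{1ij}, enc2[(i, k)] = x_{2ik}.
CODES = {
    "R1": ({(1, 1): 0, (2, 1): 1}, {(1, 1): 0, (2, 1): 1}),
    "R2": ({(1, 1): 0, (1, 2): 1}, {(1, 1): 1}),
    "R3": ({(1, 1): 1}, {(1, 1): 0, (1, 2): 1}),
}

def is_zero_error(enc1, enc2):
    """True if every source triplet (i, j, k) yields a distinct channel output."""
    seen = {}
    for (i, j), x1 in enc1.items():
        for (i2, k), x2 in enc2.items():
            if i != i2:
                continue  # both encoders observe the same common message i
            y = multiplier_channel(x1, x2)
            if seen.setdefault(y, (i, j, k)) != (i, j, k):
                return False
    return True

results = {name: is_zero_error(e1, e2) for name, (e1, e2) in CODES.items()}
```

Each code conveys one binary message per channel use, which corresponds to a rate of $\log 2$ in the respective coordinate.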

By similar arguments, we find that Fig. 5 gives the region of admissible
rates for many other binary-input, binary-output deterministic
channels (ones with all transition probabilities equal to zero or one).
Degenerate cases exist, however, in which the region $\mathcal{C}$ reduces to a
portion of a plane. For example, if $P_{Y|X_1X_2}(0|0, 0) = P_{Y|X_1X_2}(0|0, 1)
= P_{Y|X_1X_2}(1|1, 0) = P_{Y|X_1X_2}(1|1, 1) = 1$ and all other probabilities
are zero, it is easy to verify that $\mathcal{C} = \{\mathbf{R} = (R_0, R_1, R_2) : 0 \le R_0 + R_1
\le \log 2,\ R_2 = 0,\ R_1, R_0 \ge 0\}$.

Fig. 5. Admissible rate region for the multiplier channel.

Example 2 (Symmetric Noisy Channel) 

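Let $\mathcal{X}_1 = \mathcal{X}_2 = \{0, 1\}$, $\mathcal{Y} = \{0, 1, 2, 3\}$ and let $P_{Y|X_1X_2}(y|x_1, x_2)$ be
given as shown in Table I. Let $M$ be as in (9). Define $P_Z(z_i) = \gamma_i$,
$P_{X_1|Z}(0|z_i) = \alpha_i$, $P_{X_2|Z}(0|z_i) = \beta_i$, $i = 1, 2, \ldots, M$. Straightforward
calculations then yield

$$I(X_1; Y \mid X_2, Z) = \sum_{i=1}^{M} \gamma_i \big(f_1(\alpha_i, p) - K(p)\big), \tag{16}$$

$$I(X_2; Y \mid X_1, Z) = \sum_{i=1}^{M} \gamma_i \big(f_1(\beta_i, p) - K(p)\big), \tag{17}$$

$$I(X_1, X_2; Y \mid Z) = \sum_{i=1}^{M} \gamma_i \big(f_2(\alpha_i, \beta_i, p) - K(p)\big), \tag{18}$$

where

$$f_1(\delta, p) = \frac{2}{3}\, p \log \frac{3}{p}
+ \Big((1 - p)\delta + \frac{p}{3}(1 - \delta)\Big) \log \frac{1}{(1 - p)\delta + \frac{p}{3}(1 - \delta)}
+ \Big(\frac{p}{3}\delta + (1 - p)(1 - \delta)\Big) \log \frac{1}{\frac{p}{3}\delta + (1 - p)(1 - \delta)}, \tag{19}$$

$$K(p) = (1 - p) \log \frac{1}{1 - p} + p \log \frac{3}{p}, \tag{20}$$

and

$$\begin{aligned}
f_2(\alpha, \beta, p) ={}& \Big((1 - p)\alpha\beta + \frac{p}{3}(1 - \alpha\beta)\Big) \log \frac{1}{(1 - p)\alpha\beta + \frac{p}{3}(1 - \alpha\beta)} \\
&+ \Big((1 - p)\alpha(1 - \beta) + \frac{p}{3}(1 - \alpha + \alpha\beta)\Big) \log \frac{1}{(1 - p)\alpha(1 - \beta) + \frac{p}{3}(1 - \alpha + \alpha\beta)} \\
&+ \Big((1 - p)\beta(1 - \alpha) + \frac{p}{3}(1 - \beta + \alpha\beta)\Big) \log \frac{1}{(1 - p)\beta(1 - \alpha) + \frac{p}{3}(1 - \beta + \alpha\beta)} \\
&+ \Big((1 - p)(1 - \alpha)(1 - \beta) + \frac{p}{3}(\alpha + \beta - \alpha\beta)\Big) \log \frac{1}{(1 - p)(1 - \alpha)(1 - \beta) + \frac{p}{3}(\alpha + \beta - \alpha\beta)}.
\end{aligned} \tag{21}$$

It is easy to show that $f_1(\delta, p) \le f_1(\tfrac{1}{2}, p)$ and $f_2(\alpha, \beta, p) \le f_2(\tfrac{1}{2}, \tfrac{1}{2}, p)
= \log 4$. Therefore, the three mutual informations in (16), (17), and
(18) are simultaneously maximized by setting $\alpha_i = \beta_i = \tfrac{1}{2}$, $i = 1,
2, \ldots, M$. Furthermore,

$$I(X_1, X_2; Y) = H(Y) - H(Y \mid X_1 X_2) = H(Y) - K(p) \le \log 4 - K(p)$$

with equality when $\alpha_i = \beta_i = \tfrac{1}{2}$, $i = 1, 2, \ldots, M$. Thus all four
mutual informations, $I(X_1; Y \mid X_2, Z)$, $I(X_2; Y \mid X_1, Z)$, $I(X_1, X_2; Y \mid Z)$,
and $I(X_1, X_2; Y)$, are maximized for the same choice of the parameters
$\alpha_i$, $\beta_i$, and $\gamma_i$, and the maximum values are independent of $M$. The
capacity region for this channel then is given by

$$0 \le R_1 \le f_1(\tfrac{1}{2}, p) - K(p), \tag{22}$$

$$0 \le R_2 \le f_1(\tfrac{1}{2}, p) - K(p), \tag{23}$$

$$0 \le R_0 + R_1 + R_2 \le \log 4 - K(p). \tag{24}$$

This region is shown in Fig. 6.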
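The quantities in (19) to (24) are easy to evaluate numerically. The Python sketch below assumes the reading of (19), (20), and (21) in which each expression is the entropy of a four-point distribution whose probabilities are the bracketed terms of the formula; Table I itself is not used. The function names are chosen here for illustration, and all values are in nats.

```python
import math

def entropy(probs):
    """Entropy in nats of a finite distribution; zero terms contribute nothing."""
    return -sum(q * math.log(q) for q in probs if q > 0.0)

def K(p):
    """Eq. (20): (1 - p) log(1 / (1 - p)) + p log(3 / p)."""
    return entropy([1.0 - p, p / 3.0, p / 3.0, p / 3.0])

def f1(delta, p):
    """Eq. (19), written as the entropy of its four probability terms."""
    return entropy([
        p / 3.0,
        p / 3.0,
        (1.0 - p) * delta + (p / 3.0) * (1.0 - delta),
        (p / 3.0) * delta + (1.0 - p) * (1.0 - delta),
    ])

def f2(alpha, beta, p):
    """Eq. (21), written as the entropy of its four probability terms."""
    ab = alpha * beta
    return entropy([
        (1.0 - p) * ab + (p / 3.0) * (1.0 - ab),
        (1.0 - p) * alpha * (1.0 - beta) + (p / 3.0) * (1.0 - alpha + ab),
        (1.0 - p) * beta * (1.0 - alpha) + (p / 3.0) * (1.0 - beta + ab),
        (1.0 - p) * (1.0 - alpha) * (1.0 - beta) + (p / 3.0) * (alpha + beta - ab),
    ])

def region_limits(p):
    """Upper limits in (22), (23), (24): (max R1, max R2, max R0 + R1 + R2)."""
    single = f1(0.5, p) - K(p)
    return single, single, math.log(4) - K(p)

r1_max, r2_max, sum_max = region_limits(0.1)
```

At $\alpha = \beta = \tfrac{1}{2}$ all four probabilities in (21) equal $\tfrac{1}{4}$, which is why $f_2(\tfrac{1}{2}, \tfrac{1}{2}, p) = \log 4$ for every $p$.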

Fig. 6. Admissible rate region for the symmetric noisy binary channel.
The axis intercepts are $\log 4 - K(p)$ and $f_1(\tfrac{1}{2}, p) - K(p)$.

IV. EXISTENCE OF CODINGS WITH SMALL $P_e$

In this section we outline a proof of the existence of codings which
have vanishingly small probability of error for certain values of the
rate vector $\mathbf{R}$ and sufficiently large block length $N$. Tedious details are
relegated to the appendixes. A random coding argument is used. We
calculate the average probability of error for an ensemble of codings,
then argue that there must exist at least one member of the ensemble
having error probability as small as this average. Actually, we can only
compute an upper bound for this average error probability, but this
bound is sufficiently small for our purposes.

For every coding, we shall use the same form of decoder mapping.
Assume for the moment that the block length $N$, the rate vector $\mathbf{R}$, and
the encoder functions $f_N(i, j) = \mathbf{x}_{1ij}$ and $g_N(i, k) = \mathbf{x}_{2ik}$ are fixed. For
each $\mathbf{y} \in (\mathcal{Y})^N$, the decoder computes the $M_0 \times M_1 \times M_2$ numbers

$$P^N_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_{1ij}, \mathbf{x}_{2ik}), \quad i \in I_0, \quad j \in I_1, \quad k \in I_2.$$

Then $h(\mathbf{y}) = (i_0, j_0, k_0)$ if and only if $(i_0, j_0, k_0)$ is the smallest triplet
(in lexicographic order) such that

$$P^N_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_{1 i_0 j_0}, \mathbf{x}_{2 i_0 k_0}) \ge P^N_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_{1ij}, \mathbf{x}_{2ik}) \tag{25}$$

for all $(i, j, k)$. Such a decoder mapping achieves a maximum likelihood
decision among the possible source outputs.
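The decoding rule (25) can be sketched as a brute-force search over all source triplets. The Python code below is an illustration under assumptions made here: code words are stored as tuples in dictionaries `enc1[(i, j)]` and `enc2[(i, k)]`, the channel is stored as `channel[(a, b)][c]`, and the function name is hypothetical. Enumerating triplets in lexicographic order and updating only on a strict improvement reproduces the tie-breaking rule of the text.

```python
from itertools import product

def ml_decode(y, enc1, enc2, channel, sizes):
    """Smallest (i, j, k), in lexicographic order, maximizing P^N(y | x_1ij, x_2ik)."""
    m0, m1, m2 = sizes
    best, best_likelihood = None, -1.0
    # itertools.product enumerates the triplets in lexicographic order.
    for i, j, k in product(range(1, m0 + 1), range(1, m1 + 1), range(1, m2 + 1)):
        likelihood = 1.0
        for a, b, c in zip(enc1[(i, j)], enc2[(i, k)], y):
            likelihood *= channel[(a, b)].get(c, 0.0)
        if likelihood > best_likelihood:  # strict: ties keep the earlier triplet
            best, best_likelihood = (i, j, k), likelihood
    return best

# Usage with the multiplier channel and the code for rate point R_2 of Example 1.
multiplier = {(a, b): {a * b: 1.0} for a in (0, 1) for b in (0, 1)}
enc1 = {(1, 1): (0,), (1, 2): (1,)}
enc2 = {(1, 1): (1,)}
decoded_one = ml_decode((1,), enc1, enc2, multiplier, (1, 2, 1))
decoded_zero = ml_decode((0,), enc1, enc2, multiplier, (1, 2, 1))
```

The search visits all $M_0 M_1 M_2$ triplets, so it is practical only for very small codes; it is meant to make the rule concrete, not to be efficient.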

We now describe the class of codings for which we obtain an upper
bound to the average probability of error. It is specified by two positive
integers, $K$ and $N = KL$, where $L$ is a positive integer, by a rate vector
$\mathbf{R}$, and by a particular probability distribution for the random variables
$\mathbf{X}_1$, $\mathbf{X}_2$, and $Z$. The vectors $\mathbf{X}_1$ and $\mathbf{X}_2$ are $K$-dimensional and take on
values in $(\mathcal{X}_1)^K$ and $(\mathcal{X}_2)^K$ respectively, the spaces of channel input
$K$-vectors. The random variable $Z$ takes values from an alphabet $\mathcal{Z}$ of
size $M$ as in (9). The joint distribution of these quantities is restricted
to have the form

$$P^K_{ZX_1X_2}(z, \mathbf{x}_1, \mathbf{x}_2) = P^K_{X_1|Z}(\mathbf{x}_1 \mid z)\, P^K_{X_2|Z}(\mathbf{x}_2 \mid z)\, P_Z(z) \tag{26}$$

and we denote this collection of distributions by $\mathcal{P}_K$. A class of codings
is thus specified by $K$, $N = KL$, $\mathbf{R}$, and a $P^K_{ZX_1X_2} \in \mathcal{P}_K$.

Now let $K$, $N = KL$, $\mathbf{R}$, and $P^K_{ZX_1X_2} \in \mathcal{P}_K$ be given. A set of
$N$-dimensional code vectors $\mathbf{x}_{111}, \ldots, \mathbf{x}_{11M_1}$, $\mathbf{x}_{211}, \ldots, \mathbf{x}_{21M_2}$ in the corresponding
ensemble of codings is obtained as follows. Choose a sample, say $z$,
from the distribution $P_Z(\cdot)$. Next independently choose $M_1$ $K$-vectors
from $P^K_{X_1|Z}(\cdot \mid z)$ and then $M_2$ $K$-vectors from $P^K_{X_2|Z}(\cdot \mid z)$. These are
respectively the first $K$ components of the $N$-vectors $\mathbf{x}_{111}, \ldots, \mathbf{x}_{11M_1}$,
$\mathbf{x}_{211}, \ldots, \mathbf{x}_{21M_2}$.

To obtain the next $K$ components of the code words, independently
choose a new sample $z$ from $P_Z(\cdot)$ and repeat the process. After a total
of $L$ drawings from $P_Z(\cdot)$ the specification of the $N$-vectors $\mathbf{x}_{111}, \ldots,
\mathbf{x}_{11M_1}$, $\mathbf{x}_{211}, \ldots, \mathbf{x}_{21M_2}$ is complete. The entire process is then repeated
to obtain the remaining code words, those with second subscript
equal to $2, 3, \ldots, M_0$.
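The drawing procedure just described can be sketched in Python as follows. This is an illustration under assumptions made here: the function `draw_codebook` and its parameters are hypothetical, the conditional laws $P^K_{X_1|Z}$ and $P^K_{X_2|Z}$ are supplied as caller-provided sampling functions returning one $K$-tuple per call, and the values of $z$ are indexed from 0 rather than from 1.

```python
import random

def draw_codebook(sizes, num_blocks, p_z, sample_x1_block, sample_x2_block, rng):
    """Draw one coding from the ensemble: L = num_blocks segments of K symbols each.

    sample_x1_block(z, rng) and sample_x2_block(z, rng) each return one
    K-tuple drawn from the conditional law of X_1 or X_2 given Z = z.
    Returns enc1[(i, j)] = x_{1ij} and enc2[(i, k)] = x_{2ik} as tuples.
    """
    m0, m1, m2 = sizes
    enc1, enc2 = {}, {}
    for i in range(1, m0 + 1):           # repeat the whole process for each i
        words1 = [[] for _ in range(m1)]
        words2 = [[] for _ in range(m2)]
        for _ in range(num_blocks):      # L independent drawings of z
            z = rng.choices(range(len(p_z)), weights=p_z)[0]
            for word in words1:          # M_1 independent K-vectors given z
                word.extend(sample_x1_block(z, rng))
            for word in words2:          # M_2 independent K-vectors given z
                word.extend(sample_x2_block(z, rng))
        for j, word in enumerate(words1, start=1):
            enc1[(i, j)] = tuple(word)
        for k, word in enumerate(words2, start=1):
            enc2[(i, k)] = tuple(word)
    return enc1, enc2

# Degenerate illustration with K = 2: each block simply repeats the value of z,
# so all code words sharing the same common message i coincide.
def repeat_z(z, rng):
    return (z, z)

enc1, enc2 = draw_codebook((2, 3, 1), 4, [0.5, 0.5], repeat_z, repeat_z, random.Random(0))
```

The shared sequence of $z$ samples is what couples the two encoders' code words for a given common message $i$; code words for different values of $i$ use independent drawings.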

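We now seek an upper bound to the average probability of error for
the codings in this ensemble, an average in which the probability of
error for each particular coding is weighted in accordance with its
probability of occurrence in the ensemble. We denote this average
probability of error by $P_e(N, P^K_{ZX_1X_2})$ and we denote by $P_{e|ijk}(N, P^K_{ZX_1X_2})$
the average probability of error given that the source triplet $(i, j, k)$
was presented for transmission. A useful result is

Theorem 1: The average probability of error conditioned on the source
triplet $(i, j, k)$ has an upper bound

$$P_{e|ijk}(N, P^K_{ZX_1X_2}) \le \sum_{a=1}^{4} \exp\big\{-N\big[E_a(\rho_a, P^K_{ZX_1X_2}) - \rho_a R_a\big]\big\}, \tag{27}$$

where $0 \le \rho_a \le 1$, $a = 1, 2, 3, 4$,

$$R_a = \begin{cases} R_1, & a = 1 \\ R_2, & a = 2 \\ R_1 + R_2, & a = 3 \\ R_0 + R_1 + R_2, & a = 4 \end{cases} \tag{28}$$

and

$$E_1(\rho_1, P^K_{ZX_1X_2}) = -\frac{1}{K} \ln \sum_{\mathbf{y}} \sum_{z} \sum_{\mathbf{x}_2} P^K_{X_2|Z}(\mathbf{x}_2 \mid z) P_Z(z) \Big( \sum_{\mathbf{x}_1} P^K_{X_1|Z}(\mathbf{x}_1 \mid z) \big(P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2)\big)^{1/(1+\rho_1)} \Big)^{1+\rho_1}, \tag{29a}$$

$$E_2(\rho_2, P^K_{ZX_1X_2}) = -\frac{1}{K} \ln \sum_{\mathbf{y}} \sum_{z} \sum_{\mathbf{x}_1} P^K_{X_1|Z}(\mathbf{x}_1 \mid z) P_Z(z) \Big( \sum_{\mathbf{x}_2} P^K_{X_2|Z}(\mathbf{x}_2 \mid z) \big(P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2)\big)^{1/(1+\rho_2)} \Big)^{1+\rho_2}, \tag{29b}$$

$$E_3(\rho_3, P^K_{ZX_1X_2}) = -\frac{1}{K} \ln \sum_{\mathbf{y}} \sum_{z} P_Z(z) \Big( \sum_{\mathbf{x}_1} \sum_{\mathbf{x}_2} P^K_{X_1|Z}(\mathbf{x}_1 \mid z) P^K_{X_2|Z}(\mathbf{x}_2 \mid z) \big(P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2)\big)^{1/(1+\rho_3)} \Big)^{1+\rho_3}, \tag{29c}$$

$$E_4(\rho_4, P^K_{ZX_1X_2}) = -\frac{1}{K} \ln \sum_{\mathbf{y}} \Big( \sum_{\mathbf{x}_1} \sum_{\mathbf{x}_2} P^K_{X_1X_2}(\mathbf{x}_1, \mathbf{x}_2) \big(P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2)\big)^{1/(1+\rho_4)} \Big)^{1+\rho_4}. \tag{29d}$$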

A proof of this theorem is given in Appendix A. It follows closely the
proof given in Gallager [9] for the single-input, single-output channel.

Since the bound proved in the theorem is independent of the triplet
$(i, j, k)$, we see that this same bound applies to the unconditioned
average probability of error $P_e(N, P^K_{ZX_1X_2})$. Finally, for fixed $N = KL$,
and $P^K_{ZX_1X_2}$, there must be at least one coding in the ensemble with
probability of error no greater than the average probability of error.
Thus we have

Theorem 2: For every positive integer $K$, for every positive integer $N$ that is
an integral multiple of $K$, for every joint distribution $P^K_{ZX_1X_2}$ of form (26),
and for every rate vector $\mathbf{R}$, there exists a coding $C_N(\mathbf{R})$ such that

$$P_e(C_N(\mathbf{R})) \le \sum_{a=1}^{4} \exp\big\{-N\big[E_a(\rho_a, P^K_{ZX_1X_2}) - \rho_a R_a\big]\big\} \tag{30}$$

for all $\rho_a$, $0 \le \rho_a \le 1$, $a = 1, 2, 3, 4$. The $E_a$ and $R_a$ are given by (28)
and (29).

For a given $P^K_{ZX_1X_2}$ and for certain values of the rate vector $\mathbf{R}$, the
upper bound decreases exponentially in $N$. For these values of $\mathbf{R}$, by
making $N$ sufficiently large, we can insure a small probability of error.
We now determine for what rate vectors this is the case.
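Define $\mathcal{R}(P^K_{ZX_1X_2Y})$ as the set of rate vectors $\mathbf{R}$ for which

$$0 \le R_1 < \frac{1}{K} I(\mathbf{X}_1; \mathbf{Y} \mid \mathbf{X}_2, Z), \tag{31a}$$

$$0 \le R_2 < \frac{1}{K} I(\mathbf{X}_2; \mathbf{Y} \mid \mathbf{X}_1, Z), \tag{31b}$$

$$0 \le R_1 + R_2 < \frac{1}{K} I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid Z), \tag{31c}$$

$$0 \le R_0 + R_1 + R_2 < \frac{1}{K} I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}), \tag{31d}$$

where the mutual informations are evaluated under the distribution

$$P^K_{ZX_1X_2Y}(z, \mathbf{x}_1, \mathbf{x}_2, \mathbf{y}) = P^K_{ZX_1X_2}(z, \mathbf{x}_1, \mathbf{x}_2)\, P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2), \tag{32}$$

where $P^K_{ZX_1X_2}$ is given by (26) and $P^K_{Y|X_1X_2}$ by (4).
In Appendix B we prove

Theorem 3: For every $\epsilon > 0$ and every rate vector $\mathbf{R} \in \mathcal{R}(P^K_{ZX_1X_2Y})$, there
exists an $L_0$ and a sequence of codings, $C_N(\mathbf{R})$, such that

$$P_e(C_N(\mathbf{R})) \le \epsilon \quad \text{for every } N = KL,\ L \ge L_0. \tag{33}$$

This theorem holds for all $P^K_{ZX_1X_2}$ of the form given in (26), that is,
for all $P^K_{ZX_1X_2} \in \mathcal{P}_K$. Now define

$$\mathcal{R}_K \equiv \bigcup_{P^K_{ZX_1X_2} \in \mathcal{P}_K} \mathcal{R}(P^K_{ZX_1X_2Y}), \tag{34}$$

and finally define

$$\mathcal{R} = \bigcup_{K} \mathcal{R}_K, \tag{35}$$

where $K = 1, 2, \ldots$. We then have the following main result:

Theorem 4: For every $\epsilon > 0$ and for every rate vector $\mathbf{R} \in \mathcal{R}$, there exist
values of $K$ and $L_0$ and a sequence of codings $C_N(\mathbf{R})$ such that

$$P_e(C_N(\mathbf{R})) \le \epsilon \quad \text{for every } N = KL,\ L \ge L_0. \tag{36}$$

Note that if we use the statement of Theorem 1 instead of Theorem 2,
Theorem 4 becomes

Corollary 1: For every $\epsilon > 0$, for every message triplet $(i, j, k)$, and for
every rate vector $\mathbf{R} \in \mathcal{R}$, there exist values of $K$ and $L_0$ and a sequence of
codings $C_N(\mathbf{R})$ such that

$$P_{e|ijk}(C_N(\mathbf{R})) \le \epsilon \quad \text{for every } N = KL,\ L \ge L_0. \tag{37}$$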
V. CONVERSE THEOREM

In this section, we present a series of lemmas and theorems which
yield a converse to the Coding Theorem 4. Let $\hat{\mathcal{R}}$ denote the closure of
the region $\mathcal{R}$ given by (35) and let $\hat{\mathcal{R}}^c$ be the complement of $\hat{\mathcal{R}}$ with
respect to the first octant. We shall ultimately show that every coding
$C_K(\mathbf{R})$ with $\mathbf{R} \in \hat{\mathcal{R}}^c$ transmits with a probability of error not less than
a constant $\delta > 0$ which is independent of $K$.

Our notation is as before except that $K$, instead of $N$, will be used
for the block length of a code. Let $K$, $\mathbf{R}$, and the channel be given. The
associated vector $\mathbf{M}$ with components

$$M_a = \lceil e^{K R_a} \rceil, \quad a = 0, 1, 2, \tag{38}$$

is then determined. We shall no longer be concerned with ensembles of
codes, but rather fix our attention on some given encoding functions
$\mathbf{x}_{1ij} = f(i, j)$, $\mathbf{x}_{2ik} = g(i, k)$, where $i \in I_0$, $j \in I_1$, and $k \in I_2$. These
vectors need not be distinct. Then with the source statistics given by
(2), the given encoding defines a joint probability distribution

$$P^K_{U_0U_1U_2X_1X_2Y}(i, j, k, \mathbf{x}_1, \mathbf{x}_2, \mathbf{y}) = P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2)\, Q^K_{X_1|U_0U_1}(\mathbf{x}_1 \mid i, j)\, Q^K_{X_2|U_0U_2}(\mathbf{x}_2 \mid i, k)\, P_{U_0U_1U_2}(i, j, k) \tag{39}$$

for the random variables in question. Here

$$Q^K_{X_1|U_0U_1}(\mathbf{x}_1 \mid i, j) = \delta_{\mathbf{x}_1 \mathbf{x}_{1ij}}, \tag{40a}$$

$$Q^K_{X_2|U_0U_2}(\mathbf{x}_2 \mid i, k) = \delta_{\mathbf{x}_2 \mathbf{x}_{2ik}}, \tag{40b}$$

where the right-hand terms are Kronecker deltas. Entropies and
mutual informations can then be calculated from (39) by the usual
formulas.

Several more definitions are needed. We shall make use of the rate
number vector $\mathbf{R}' = (R_0', R_1', R_2')$ given by

$$R_a' = \frac{1}{K} \log M_a \ge R_a, \quad a = 0, 1, 2, \tag{41}$$

and the elementary entropy function

$$h(x) \equiv -x \log x - (1 - x) \log (1 - x). \tag{42}$$

Finally, we define

$$P_{e1}(C_K(\mathbf{R})) = \Pr[U_1^* \ne U_1], \tag{43}$$

$$P_{e2}(C_K(\mathbf{R})) = \Pr[U_2^* \ne U_2], \tag{44}$$

$$P_{e3}(C_K(\mathbf{R})) = \Pr[U_1^* \ne U_1 \text{ or } U_2^* \ne U_2]. \tag{45}$$

Then, for the probability of error using the coding $C_K(\mathbf{R})$ we have

$$P_e(C_K(\mathbf{R})) = \Pr[U_0^* \ne U_0 \text{ or } U_1^* \ne U_1 \text{ or } U_2^* \ne U_2] \ge \max\,[P_{e1}(C_K), P_{e2}(C_K), P_{e3}(C_K)]. \tag{46}$$

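We now proceed to the first of the lemmas which is a generalization
of Fano's inequality (Ref. 9, Theorem 4.3.1). The proof is given in
Appendix C.

Lemma 1: For every $K$ and $\mathbf{R}$ and for every $C_K(\mathbf{R})$:

$$H(U_1 \mid \mathbf{Y}, U_0, U_2) \le P_{e1}(C_K) \log M_1 + h(P_{e1}(C_K)); \tag{47a}$$

$$H(U_2 \mid \mathbf{Y}, U_0, U_1) \le P_{e2}(C_K) \log M_2 + h(P_{e2}(C_K)); \tag{47b}$$

$$H(U_1, U_2 \mid \mathbf{Y}, U_0) \le P_{e3}(C_K) \log (M_1 M_2) + h(P_{e3}(C_K)); \tag{47c}$$

$$H(U_0, U_1, U_2 \mid \mathbf{Y}) \le P_e(C_K) \log (M_0 M_1 M_2) + h(P_e(C_K)). \tag{47d}$$

The next lemma, proved in Appendix D, is a generalization of the
data processing theorem (Ref. 9, Theorem 4.3.3).

Lemma 2: For every $K$ and $\mathbf{R}$ and every coding $C_K(\mathbf{R})$:

$$\text{(a)} \quad I(U_1; \mathbf{Y} \mid U_2, U_0) \le I(\mathbf{X}_1; \mathbf{Y} \mid \mathbf{X}_2, U_0); \tag{48a}$$

$$\text{(b)} \quad I(U_2; \mathbf{Y} \mid U_1, U_0) \le I(\mathbf{X}_2; \mathbf{Y} \mid \mathbf{X}_1, U_0); \tag{48b}$$

$$\text{(c)} \quad I(U_1, U_2; \mathbf{Y} \mid U_0) \le I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0); \tag{48c}$$

$$\text{(d)} \quad I(U_0, U_1, U_2; \mathbf{Y}) \le I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}). \tag{48d}$$

Lemma 3: For every $K$ and $\mathbf{R}$ and every coding $C_K(\mathbf{R})$ with rate-number
vector $\mathbf{R}'$:

$$\text{(a)} \quad K R_1' - I(\mathbf{X}_1; \mathbf{Y} \mid \mathbf{X}_2, U_0) \le P_{e1}(C_K) K R_1' + h(P_{e1}(C_K)); \tag{49a}$$

$$\text{(b)} \quad K R_2' - I(\mathbf{X}_2; \mathbf{Y} \mid \mathbf{X}_1, U_0) \le P_{e2}(C_K) K R_2' + h(P_{e2}(C_K)); \tag{49b}$$

$$\text{(c)} \quad K(R_1' + R_2') - I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0) \le P_{e3}(C_K) K (R_1' + R_2') + h(P_{e3}(C_K)); \tag{49c}$$

$$\text{(d)} \quad K(R_0' + R_1' + R_2') - I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}) \le P_e(C_K) K (R_0' + R_1' + R_2') + h(P_e(C_K)). \tag{49d}$$

The proof is given in Appendix E.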

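With these lemmas in hand, we return to the matter of establishing
a converse to Theorem 4. For a given $K$, $\mathbf{R}$, and encoding $C_K(\mathbf{R})$, there
is established a joint probability distribution between the random
variables $\mathbf{Y}$, $\mathbf{X}_1$, $\mathbf{X}_2$, and $U_0$ given by

$$Q^K_{U_0X_1X_2Y}(i, \mathbf{x}_1, \mathbf{x}_2, \mathbf{y}) = P^K_{Y|X_1X_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_2)\, Q^K_{X_1|U_0}(\mathbf{x}_1 \mid i)\, Q^K_{X_2|U_0}(\mathbf{x}_2 \mid i)\, Q_{U_0}(i), \tag{50}$$

where $P^K_{Y|X_1X_2}$ is given by (4),

$$Q_{U_0}(i) = \frac{1}{M_0}, \tag{51a}$$

$$Q^K_{X_1|U_0}(\mathbf{x}_1 \mid i) = \frac{1}{M_1} \sum_{j=1}^{M_1} \delta_{\mathbf{x}_1 \mathbf{x}_{1ij}}, \tag{51b}$$

$$Q^K_{X_2|U_0}(\mathbf{x}_2 \mid i) = \frac{1}{M_2} \sum_{k=1}^{M_2} \delta_{\mathbf{x}_2 \mathbf{x}_{2ik}}, \tag{51c}$$

$$i \in I_0, \quad \mathbf{x}_1 \in (\mathcal{X}_1)^K, \quad \mathbf{x}_2 \in (\mathcal{X}_2)^K.$$

Here $C_K(\mathbf{R})$ is defined by the code words $\mathbf{x}_{1ij}$ and $\mathbf{x}_{2ik}$, $i \in I_0$, $j \in I_1$,
$k \in I_2$. We denote by $\mathcal{Q}_K$ the set of all distributions $Q^K_{U_0X_1X_2}$ derived from
code books, that is, all distributions of the form obtained by summing
(50) over all $\mathbf{y} \in (\mathcal{Y})^K$.

With $K$, $\mathbf{R}$, and an encoding $C_K(\mathbf{R})$ now fixed, we define $\mathcal{S}^c(Q^K_{U_0X_1X_2Y})$
to be the set of all vectors $\mathbf{S} = (S_0, S_1, S_2)$ with non-negative components
such that at least one of the following inequalities is satisfied,

$$S_1 > \frac{1}{K} I(\mathbf{X}_1; \mathbf{Y} \mid U_0, \mathbf{X}_2), \tag{52a}$$

$$S_2 > \frac{1}{K} I(\mathbf{X}_2; \mathbf{Y} \mid U_0, \mathbf{X}_1), \tag{52b}$$

$$S_1 + S_2 > \frac{1}{K} I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0), \tag{52c}$$

$$S_0 + S_1 + S_2 > \frac{1}{K} I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}). \tag{52d}$$

Next define

$$\mathcal{S}_K^c \equiv \bigcap_{Q^K_{U_0X_1X_2} \in \mathcal{Q}_K} \mathcal{S}^c(Q^K_{U_0X_1X_2Y}) \tag{53}$$

and finally

$$\mathcal{S}^c = \bigcap_{K} \mathcal{S}_K^c. \tag{54}$$

Here $c$ denotes complement with respect to the first octant $S_0 \ge 0$,
$S_1 \ge 0$, $S_2 \ge 0$. Thus, for example, $\mathcal{S}(Q^K_{U_0X_1X_2Y})$ is a closed convex
polyhedron bounded by seven planes.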

Note the similarity between (50) and (32) and between (31) which
defines $\mathcal{R}(P^K_{ZX_1X_2Y})$ and (52) which defines $\mathcal{S}^c(Q^K_{U_0X_1X_2Y})$. For every
distribution in $\mathcal{Q}_K$, there is a distribution in $\mathcal{P}_K$ that will give equality
between the corresponding right-hand members of (31) and (52) when
$Z$ and $U_0$ are properly identified. We make this identification by
choosing $M = M_0$, and by taking $Z$ to be uniformly distributed over
its $M$ possible values. With a member of $\mathcal{P}_K$ identified with each $Q^K_{U_0X_1X_2}$
in this way, we see that for this particular $P^K_{ZX_1X_2}$ one has

$$\mathcal{S}(Q^K_{U_0X_1X_2Y}) = \hat{\mathcal{R}}(P^K_{ZX_1X_2Y}).$$

Here the caret, $\hat{\ }$, denotes closure. Comparison of (34) and (53) then
shows that $\mathcal{S}_K \subset \hat{\mathcal{R}}_K$. Thus $\mathcal{S} \subset \hat{\mathcal{R}}$ or

$$\hat{\mathcal{R}}^c \subset \mathcal{S}^c. \tag{55}$$

In Appendix F we establish

Theorem 5: If $\mathbf{R}$ is an interior point of $\mathcal{S}^c$, then for every $K$ and every
encoding $C_K(\mathbf{R})$,

$$P_e(C_K(\mathbf{R})) \ge \delta > 0,$$

where $\delta = \delta(\mathbf{R})$ is independent of the encoding and of $K$.

VI. SPECIFICATION OF THE CAPACITY REGION 

At the end of Section II, the capacity region was defined as the
closure of the set of admissible rate points. Theorems 4 and 5 along
with (55) show that $\mathcal{C} = \hat{\mathcal{R}}$, where $\mathcal{R}$ is defined by (31), (34), and (35).
This characterization of $\mathcal{C}$ is of little computational value. It entails
the calculation of the mutual informations appearing on the right of
(31) for all distributions of form (32). A further infinite union over all
values of $K$ is then required. In this section we shall show how a much
simpler description of $\mathcal{C}$ can be obtained, one that is independent of $K$
and hence much more suitable for numerical calculations.

Central to the development of this simpler characterization of $\mathcal{C}$ is

Theorem 6: The region $\mathcal{C}$ of admissible rates is convex.

This theorem is proved by a time-sharing argument in Appendix G.
By deleting words from a code, one obtains an additional obvious
feature of the region $\mathcal{C}$, which we state as

Theorem 7: Let $\mathbf{R} \in \mathcal{C}$. Then if $0 \le R'_\alpha \le R_\alpha$, $\alpha = 0, 1, 2$, the rate
vector $\mathbf{R}'$ is also contained in $\mathcal{C}$.

We return to our simpler characterization of $\mathcal{C}$. Let $\mathcal{R}_1$ denote the
region specified by (31), (32), and (34) for $K = 1$. Since $\mathcal{C}$ is the closure
of $\mathcal{R}$ as given by (35), $\mathcal{R}_1 \subseteq \mathcal{C}$. From Theorem 6 it follows that also

$$ \mathcal{R}' \equiv \text{convex hull } \mathcal{R}_1 \subseteq \mathcal{C}. \tag{56} $$

(The convex hull of a set $\mathcal{A}$ consists of all points in $\mathcal{A}$ and all points
on all line segments joining points of $\mathcal{A}$.) We shall soon show that
indeed $\mathcal{R}' = \mathcal{C}$.

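As a step in this direction, in Appendix H we establish

Lemma 4: For every distribution $P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}(z, \mathbf{x}_1, \mathbf{x}_2, \mathbf{y})$ as in (32),

$$ I(\mathbf{X}_1; \mathbf{Y} \mid Z, \mathbf{X}_2) \le \sum_{t=1}^{K} I(X_{1t}; Y_t \mid Z, X_{2t}), \tag{57a} $$

$$ I(\mathbf{X}_2; \mathbf{Y} \mid Z, \mathbf{X}_1) \le \sum_{t=1}^{K} I(X_{2t}; Y_t \mid Z, X_{1t}), \tag{57b} $$

$$ I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid Z) \le \sum_{t=1}^{K} I(X_{1t}, X_{2t}; Y_t \mid Z), \tag{57c} $$

$$ I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}) \le \sum_{t=1}^{K} I(X_{1t}, X_{2t}; Y_t). \tag{57d} $$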

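Combined with (31) the lemma shows that

$$ \mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}) \supseteq \mathcal{R}(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}), \tag{58} $$

where $\mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$ is the set of rate vectors $\mathbf{R}$ for which

$$ 0 \le R_1 \le \frac{1}{K} \sum_{t=1}^{K} I(X_{1t}; Y_t \mid Z, X_{2t}), \tag{59a} $$

$$ 0 \le R_2 \le \frac{1}{K} \sum_{t=1}^{K} I(X_{2t}; Y_t \mid Z, X_{1t}), \tag{59b} $$

$$ 0 \le R_1 + R_2 \le \frac{1}{K} \sum_{t=1}^{K} I(X_{1t}, X_{2t}; Y_t \mid Z), \tag{59c} $$

$$ 0 \le R_0 + R_1 + R_2 \le \frac{1}{K} \sum_{t=1}^{K} I(X_{1t}, X_{2t}; Y_t), \tag{59d} $$

where the right sides of (59) are evaluated under $P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}$. Note that
$\mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$, unlike $\mathcal{R}(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$, is a closed set by definition.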

Now, a typical term on the right of (59) depends only on the marginal
distribution $P_{Z X_{1t} X_{2t} Y_t}(z, x_{1t}, x_{2t}, y_t)$. By summing (32) over the
appropriate indexes and taking account of (26), it is seen that this
marginal can be written

$$ P_{Z X_{1t} X_{2t} Y_t}(z, x_{1t}, x_{2t}, y_t) = P_Z(z) P_{X_{1t}|Z}(x_{1t} \mid z) P_{X_{2t}|Z}(x_{2t} \mid z) P_{Y_t | X_{1t} X_{2t}}(y_t \mid x_{1t}, x_{2t}), $$

which is a distribution of the form $P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}$ for $K = 1$. Thus the
right-hand sides of (59), which are the parameters defining the box-like
region $\mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$, are averages of parameters that define the box-like
regions $\mathcal{R}(P^{(1)}_{Z X_{1t} X_{2t} Y_t})$, $t = 1, 2, \ldots, K$. In Appendix I this fact and the
convexity of the box-like regions are used to show that

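$$ \text{convex hull} \bigcup_{t=1}^{K} \mathcal{R}(P^{(1)}_{Z X_{1t} X_{2t} Y_t}) \supseteq \mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}), \tag{60} $$

from which it also follows that

$$ \text{convex hull} \bigcup_{P^{(1)}_{Z X_1 X_2 Y}} \mathcal{R}(P^{(1)}_{Z X_1 X_2 Y}) \supseteq \bigcup_{P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}} \mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}). \tag{61} $$

We now have

$$ \begin{aligned}
\mathcal{R}' = \text{convex hull } \mathcal{R}_1
 &= \text{convex hull closure} \bigcup_{P^{(1)}_{Z X_1 X_2 Y}} \mathcal{R}(P^{(1)}_{Z X_1 X_2 Y}) \\
 &\supseteq \text{convex hull} \bigcup_{P^{(1)}_{Z X_1 X_2 Y}} \mathcal{R}(P^{(1)}_{Z X_1 X_2 Y})
  \supseteq \bigcup_{P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}} \mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}) \\
 &\supseteq \bigcup_{P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}} \mathcal{R}(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}).
\end{aligned} \tag{62} $$

Here the last inclusion follows from (58) and the next to last inclusion
is (61). Using (34) and (62), we now see that $\mathcal{R}' \supseteq \mathcal{R}_K$ for every
$K$. From (35) then $\mathcal{R}' \supseteq \mathcal{R}$, and since $\mathcal{R}'$ is closed by definition,
$\mathcal{R}' \supseteq \hat{\mathcal{R}} = \mathcal{C}$. Combined with (56), this shows that $\mathcal{R}' = \mathcal{C}$, and the
formulation (10)-(12) is thereby established.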
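
As an illustration of what the single-letter formulation asks one to compute, the following is a minimal sketch, not taken from the paper, that evaluates the four mutual informations bounding $R_1$, $R_2$, $R_1 + R_2$, and $R_0 + R_1 + R_2$ for one choice of $P_Z$, $P_{X_1|Z}$, $P_{X_2|Z}$ and one channel. The function name `mac_bounds`, the nested-list data layout, the use of bits (base-2 logarithms), and the example channel (a noiseless binary adder, $Y = X_1 + X_2$) are all assumptions made for this example.

```python
from math import log2

def mac_bounds(p_z, p_x1_z, p_x2_z, channel):
    """Return the four mutual informations (in bits) that bound
    R1, R2, R1+R2 and R0+R1+R2 for one product-form distribution.

    Hypothetical layout: p_z[z], p_x1_z[z][x1], p_x2_z[z][x2],
    channel[x1][x2][y] = p(y | x1, x2).
    """
    # Joint distribution P(z) P(x1|z) P(x2|z) p(y|x1,x2).
    joint = {}
    for z, pz in enumerate(p_z):
        for x1, p1 in enumerate(p_x1_z[z]):
            for x2, p2 in enumerate(p_x2_z[z]):
                for y, py in enumerate(channel[x1][x2]):
                    p = pz * p1 * p2 * py
                    if p > 0:
                        joint[(z, x1, x2, y)] = p

    def entropy(keep):
        # Entropy of the marginal on the coordinates listed in `keep`.
        marg = {}
        for k, p in joint.items():
            key = tuple(k[i] for i in keep)
            marg[key] = marg.get(key, 0.0) + p
        return -sum(p * log2(p) for p in marg.values())

    def cond_mi(a, b, c):
        # I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)
        return entropy(a + c) + entropy(b + c) - entropy(a + b + c) - entropy(c)

    Z, X1, X2, Y = 0, 1, 2, 3
    return (cond_mi([X1], [Y], [X2, Z]),      # bound on R1
            cond_mi([X2], [Y], [X1, Z]),      # bound on R2
            cond_mi([X1, X2], [Y], [Z]),      # bound on R1 + R2
            cond_mi([X1, X2], [Y], []))       # bound on R0 + R1 + R2

# Example (assumed): binary adder channel, trivial Z, uniform inputs.
adder = [[[1.0 if y == x1 + x2 else 0.0 for y in range(3)]
          for x2 in range(2)] for x1 in range(2)]
bounds = mac_bounds([1.0], [[0.5, 0.5]], [[0.5, 0.5]], adder)
```

For this assumed example the individual bounds are 1 bit each and the two sum bounds are 1.5 bits, the entropy of $Y$. Obtaining the region itself would still require the union over distributions and the convex hull described in the text.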

It is to be noted that while this reduction permits calculation of $\mathcal{C}$
by evaluating mutual informations involving no more than four random
variables, the size of the $Z$ alphabet is unrestricted. In this connection,
see the footnote in Section III.

That we can indeed take the size of the $Z$ alphabet to be 1 when
computing the intersection of $\mathcal{C}$ with the plane $R_0 = 0$, as claimed in
Section III, is seen as follows. When $R_0 = 0$, (11d) is weaker than
(11c), since always $I(X_1, X_2; Y) \ge I(X_1, X_2; Y \mid Z)$. Thus we need only
consider (11a), (11b), and (11c) in defining regions $\mathcal{R}(P_{Z X_1 X_2 Y})$ in the
$R_1$-$R_2$ plane. But the right members of these equations are of the
form

$$ I(X_1; Y \mid X_2, Z) = \sum_i P_Z(z_i) I(X_1; Y \mid X_2, Z = z_i), $$
$$ I(X_2; Y \mid X_1, Z) = \sum_i P_Z(z_i) I(X_2; Y \mid X_1, Z = z_i), $$
$$ I(X_1, X_2; Y \mid Z) = \sum_i P_Z(z_i) I(X_1, X_2; Y \mid Z = z_i). $$

An argument just like that of Appendix I now shows that $\mathcal{R} \subseteq$ convex
hull $\bigcup_i \mathcal{R}_i$, where $\mathcal{R}_i$ is given by $0 \le R_1 \le I(X_1; Y \mid X_2, Z = z_i)$,
$0 \le R_2 \le I(X_2; Y \mid X_1, Z = z_i)$, $0 \le R_1 + R_2 \le I(X_1, X_2; Y \mid Z = z_i)$.
Each box-like region $\mathcal{R}_i$ can be thought of as obtained from a distribution
in which $Z$ takes a single value with probability one. The formulation
of Section III follows at once.

VII. COMMENTARY 

7.1 Generalizations 

7.1.1 N Input Users 

The foregoing can be generalized to the case of a memoryless
channel with $N$ input users and a single output. The channel then
is specified by alphabets $\mathcal{Y}, \mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_N$ and transition probabilities
$p(y \mid x_1, x_2, \ldots, x_N)$ for $y \in \mathcal{Y}$, $x_i \in \mathcal{X}_i$, $i = 1, 2, \ldots, N$. Again we
allow the information supplied to the input users to be correlated in a
special way.

We first write out the equations for $N = 3$ in full, and then indicate
the general result. There are now seven independent sources, $S_1$, $S_2$,
$S_3$, $S_{12}$, $S_{13}$, $S_{23}$, $S_{123}$, producing information at rates $R_1$, $R_2$, $R_3$, $R_{12}$,
$R_{13}$, $R_{23}$, $R_{123}$ respectively. There are three encoders. Encoder 1 sees
the outputs of only $S_1$, $S_{12}$, $S_{13}$, $S_{123}$; encoder 2 sees the outputs of
only $S_2$, $S_{12}$, $S_{23}$, $S_{123}$; encoder 3 sees the outputs of only $S_3$, $S_{13}$, $S_{23}$,
$S_{123}$. The decoder at the channel output attempts to reproduce separately
the messages from the seven sources. Using block codes, for
certain values of the rate vector $\mathbf{R} = (R_1, R_2, R_3, R_{12}, R_{13}, R_{23}, R_{123})$,
the error probability of the system can be made arbitrarily small. The
closure of the set of all such vector rates is called the capacity region $\mathcal{C}$.

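$\mathcal{C}$ can be found as follows. Let

$$ \begin{gathered}
p_{123}(z_{123}), \quad p_{12}(z_{12}), \quad p_{13}(z_{13}), \quad p_{23}(z_{23}), \\
p_1(x_1 \mid z_{123}, z_{12}, z_{13}), \quad p_2(x_2 \mid z_{123}, z_{12}, z_{23}), \quad p_3(x_3 \mid z_{123}, z_{13}, z_{23})
\end{gathered} \tag{63} $$

be given probability distributions. Here $x_i \in \mathcal{X}_i$, $i = 1, 2, 3$. The
$Z_{12}$, $Z_{13}$, etc., have finite alphabets of unspecified size. We denote by
$P$ the distribution

$$ \begin{aligned}
P = {} & p_{123}(z_{123}) p_{12}(z_{12}) p_{13}(z_{13}) p_{23}(z_{23}) p_1(x_1 \mid z_{123}, z_{12}, z_{13}) \\
 & \times p_2(x_2 \mid z_{123}, z_{12}, z_{23}) p_3(x_3 \mid z_{123}, z_{13}, z_{23}) p(y \mid x_1, x_2, x_3).
\end{aligned} \tag{64} $$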

Now let $\mathcal{R}(P)$ be the set of $\mathbf{R}$ such that

$$ \begin{aligned}
0 &\le R_1 \le I(X_1; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}, X_2, X_3) \\
0 &\le R_2 \le I(X_2; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}, X_1, X_3) \\
0 &\le R_3 \le I(X_3; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}, X_1, X_2) \\
0 &\le R_1 + R_2 \le I(X_1, X_2; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}, X_3) \\
0 &\le R_1 + R_3 \le I(X_1, X_3; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}, X_2) \\
0 &\le R_2 + R_3 \le I(X_2, X_3; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}, X_1) \\
0 &\le R_1 + R_2 + R_3 \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{12}, Z_{13}, Z_{23}) \\
0 &\le R_1 + R_2 + R_3 + R_{12} \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{13}, Z_{23}) \\
0 &\le R_1 + R_2 + R_3 + R_{13} \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{12}, Z_{23}) \\
0 &\le R_1 + R_2 + R_3 + R_{23} \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{12}, Z_{13}) \\
0 &\le R_1 + R_2 + R_3 + R_{12} + R_{13} \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{23}) \\
0 &\le R_1 + R_2 + R_3 + R_{12} + R_{23} \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{13}) \\
0 &\le R_1 + R_2 + R_3 + R_{13} + R_{23} \le I(X_1, X_2, X_3; Y \mid Z_{123}, Z_{12}) \\
0 &\le R_1 + R_2 + R_3 + R_{12} + R_{13} + R_{23} \le I(X_1, X_2, X_3; Y \mid Z_{123}) \\
0 &\le R_1 + R_2 + R_3 + R_{12} + R_{13} + R_{23} + R_{123} \le I(X_1, X_2, X_3; Y),
\end{aligned} \tag{65} $$

where all the mutual informations here are computed with the distribution
(64). Let $\mathcal{R} = \bigcup \mathcal{R}(P)$, where the union is over all distributions of
form (64) as the factors listed in (63) are varied. Then $\mathcal{C}$ is the closure
of the convex hull of $\mathcal{R}$.

The generalization to $N$ users is simple in concept but awkward to
describe. We do not dwell long on it here. There are now $2^N - 1$
sources and $\mathcal{C}$ is a region in a $(2^N - 1)$-dimensional rate space. The list
(63) is increased to contain $2^N - N - 1$ separate distributions for as
many independent $Z$ variables ($Z_{12}, Z_{13}, \ldots, Z_{23}, \ldots, Z_{123 \cdots N}$)
and $N$ distributions of form $p_1(x_1 \mid z_{12}, z_{13}, \ldots, z_{12 \cdots N})$, etc., where
each $z$ subscript contains the $x$ subscript. Equations (64) and (65) are
generalized in an obvious way. There are now
$\sum_{k=1}^{N} \left( 2^{\binom{N}{k}} - 1 \right)$ equations
(65). $\mathcal{C}$ is given as the closure of the convex hull of the union of the
regions defined by these equations.
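
The counts quoted above can be tabulated quickly. The short sketch below is an illustration rather than part of the paper; the function name `n_user_counts` is hypothetical, and the inequality count uses the sum over $k$ of $2^{\binom{N}{k}} - 1$ as it is read from the text, which should be treated with the same caution the authors attach to the $N$-user case.

```python
from math import comb

def n_user_counts(n):
    """Counts for the conjectured N-user description (illustrative only)."""
    sources = 2**n - 1                 # one source per non-empty subset of users
    z_variables = 2**n - n - 1         # one Z per subset of two or more users
    # Number of inequalities of type (65): sum over k of 2^C(n,k) - 1.
    inequalities = sum(2**comb(n, k) - 1 for k in range(1, n + 1))
    return sources, z_variables, inequalities

counts_2 = n_user_counts(2)
counts_3 = n_user_counts(3)
```

For $N = 2$ this gives three sources, one $Z$ variable and four inequalities, in agreement with the two-user formulation; for $N = 3$ it gives seven sources, four $Z$ variables and fifteen inequalities, matching the list (65).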

These results for $N$ users were obtained by cursory examination of
the rigorous proofs given in this paper for two users. As we have not
had the courage to write out all the details, however, the assertions
made for the $N$-user case must still be regarded as conjectures, or
educated guesses.

7.1.2 Continuous Amplitudes 

It would appear that our results can be extended in a natural way to
channels with more general alphabet structures. For example, the
channel might be specified by a conditional probability density
$p(y \mid x_1, x_2)$ where $x_1$, $x_2$, and $y$ take all real values. Equation (11)
would remain the same, but the mutual informations are now given by
integrals. Densities $P_Z(z)$, $P_{X_1|Z}(x_1 \mid z)$, $P_{X_2|Z}(x_2 \mid z)$ must be specified
and the joint density of $Z$, $X_1$, $X_2$, and $Y$ is the product (10) as before.
Constraints, such as $E X_1^2 = \sigma_1^2$, $E X_2^2 = \sigma_2^2$, must be imposed on these
densities in taking the union indicated in (12).

Again, we have not verified in detail the validity of the determination
of $\mathcal{C}$ just given for continuous amplitudes. Caveat emptor.

7.2 Some Problems 

Many research problems related to the subject of this paper remain 
to be examined. A brief description of some of these follows: 

(i) The footnote in Section III suggests that the size of the alphabet 
Z can be bounded in searching for the capacity of a particular channel. 
Is this conjecture true? 

(ii) The explicit construction of good codes for use on specific 
multiple access channels is an untapped field that leads to new problems 
not found on single-input, single-output channels. For example, even 
for noiseless channels (all channel probabilities zero or one) a coding 
problem exists since users compete with each other for the use of the 
channel. 

(iii) The region of rates for which error-free transmission with finite
length codes is possible is not known. This region is analogous to the 
zero-error capacity of the single-input, single-output channel. 

(iv) For a particular multiple access channel it has been found that 
the region of admissible rates can be enlarged by allowing the encoders 
to observe the output via a feedback channel. This is in contrast to the 
situation for the single-input, single-output channel where feedback 
does not alter the capacity. In the multi-user case, however, a feedback 
channel increases the cooperation possible between the users and in 
general increases the forward capacity. How to calculate the region of 
admissible rates for multiple access channels with feedback is not 
known. 

(v) A special form has been assumed here for the correlation between 
the messages encoded by the two users. How does one handle more 
general correlations? Is the presently assumed form general in some 
asymptotic sense? 

(vi) Can one calculate the capacity region for some class of multiple 
access channels with memory? 

(vii) What is the rate distortion theory for these channels? 
APPENDIX A

Proof of Theorem 1 

Let $P_{e,i,j,k}(C)$ be the probability of error when the source triplet
$(i, j, k)$ is sent over the channel using coding $C$. Let $\Pr(C)$ be the
probability of the particular coding $C$. Then

$$ P_{e,i,j,k}(N, P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2}) = \sum_{C} \Pr(C) P_{e,i,j,k}(C), \tag{66} $$

where the sum is over all possible codings, that is, over all ways of
choosing the code words $\mathbf{x}_{111}, \ldots, \mathbf{x}_{1 M_0 M_1}$, $\mathbf{x}_{211}, \ldots, \mathbf{x}_{2 M_0 M_2}$. But the
right side of (66) can be interpreted as the probability of error in the
joint experiment of drawing a code from the ensemble and transmitting
$(i, j, k)$ over the channel. With this interpretation in mind, we have

$$ P_{e,i,j,k}(N, P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2}) = \sum_{a=1}^{4} P_a, \tag{67} $$

where

$$ P_1 = \Pr[U_0^* = i,\ U_1^* \ne j,\ U_2^* = k \mid \mathcal{B}], \tag{68a} $$

$$ P_2 = \Pr[U_0^* = i,\ U_1^* = j,\ U_2^* \ne k \mid \mathcal{B}], \tag{68b} $$

$$ P_3 = \Pr[U_0^* = i,\ U_1^* \ne j,\ U_2^* \ne k \mid \mathcal{B}], \tag{68c} $$

$$ P_4 = \Pr[U_0^* \ne i \mid \mathcal{B}], \tag{68d} $$

where $\mathcal{B}$ is the event $\{U_0 = i, U_1 = j, U_2 = k\}$. We will find upper
bounds for these four probabilities.

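We first compute an upper bound for $P_1$. Fix values for the $N$-vectors
$\mathbf{y}$, $\mathbf{x}_{1ij}$, $\mathbf{x}_{2ik}$. Let $\mathbf{z}_i$ denote the $L$-vector whose components $z_{i1}, z_{i2}, \ldots,
z_{iL}$ were used in the choice of $\mathbf{x}_{1ij}$ and $\mathbf{x}_{2ik}$. Later we shall average over
these quantities.

Define $\mathcal{A}_{j'}$ as the event that

$$ P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij'}, \mathbf{x}_{2ik}) \ge P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij}, \mathbf{x}_{2ik}). \tag{69} $$

Note that the only random variable in this expression is $\mathbf{x}_{1ij'}$. Define

$$ P^{(N,K)}_{\mathbf{X}_1|Z}(\mathbf{x} \mid \mathbf{z}) = \prod_{a=1}^{L} P^{(K)}_{\mathbf{X}_1|Z}(\mathbf{x}_a \mid z_a), \tag{70} $$

where $\mathbf{x}$ is the $N$-vector obtained by concatenating the $L$ $K$-dimensional
vectors $\mathbf{x}_a$, and $\mathbf{z}$ is an $L$-vector whose components are $z_a$, $a = 1,
2, \ldots, L$. Then the probability of the event $\mathcal{A}_{j'}$ is

$$ \Pr[\mathcal{A}_{j'}] = \sum P^{(N,K)}_{\mathbf{X}_1|Z}(\mathbf{x}_{1ij'} \mid \mathbf{z}_i), $$

where the sum is over all values of $\mathbf{x}_{1ij'}$ satisfying

$$ P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij'}, \mathbf{x}_{2ik}) \ge P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij}, \mathbf{x}_{2ik}). \tag{71} $$

Following Gallager,^9 an upper bound to this expression is

$$ \Pr[\mathcal{A}_{j'}] \le \sum_{\mathbf{x}_{1ij'}} P^{(N,K)}_{\mathbf{X}_1|Z}(\mathbf{x}_{1ij'} \mid \mathbf{z}_i) \left( \frac{P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij'}, \mathbf{x}_{2ik})}{P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij}, \mathbf{x}_{2ik})} \right)^{s_1} \tag{72} $$

for any $s_1 \ge 0$. The summation is over all values of the $N$-vector $\mathbf{x}_{1ij'}$.
For the same fixed values of $\mathbf{y}$, $\mathbf{x}_{1ij}$, $\mathbf{x}_{2ik}$, and $\mathbf{z}_i$, let $\mathcal{A}$ be the event
that (69) holds for some value of $j'$ not equal to $j$. Then from Gallager^9
(page 136)

$$ \Pr[\mathcal{A}] \le \left( \sum_{j' \ne j} \Pr[\mathcal{A}_{j'}] \right)^{\rho_1}, \tag{73} $$

for any $\rho_1$ in the range $0 \le \rho_1 \le 1$. Combining (72) and (73) we have

$$ \Pr[\mathcal{A}] \le (M_1 - 1)^{\rho_1} \left[ \sum_{\mathbf{x}_1} P^{(N,K)}_{\mathbf{X}_1|Z}(\mathbf{x}_1 \mid \mathbf{z}_i) \left( \frac{P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_1, \mathbf{x}_{2ik})}{P^{(N)}_{\mathbf{Y}|\mathbf{X}_1 \mathbf{X}_2}(\mathbf{y} \mid \mathbf{x}_{1ij}, \mathbf{x}_{2ik})} \right)^{s_1} \right]^{\rho_1}, \tag{74} $$

where the summation is over all $N$-vectors in $(\mathcal{X}_1)^N$.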
The probability of interest, P x , has an upper bound 

Pi^EIII P*ft&(y|xii,-,x 8 «)Pi~ff (x iy |z«) 

x «fiif OftaWWOk) Pr [a]. (75) 

where the inequality results from the fact that the occurrence of the 
event & does not necessarily imply the event { U* Q = i, U\ 9* j,U 2 = k\ 
but that the converse is true. Combining (74) and (75) and choosing 
si = 1/(1 + pi), we obtain 

Pi ^ (Mi - i)« ZY.Z flBPCftWWW 

y « z 

X [£(P X rfz Q ( x i|z)PYTx 1 x 2 (yl x i, x 2)) 1/(1+ ' ,l) ] 1+ ", (76) 
where the summations for $\mathbf{y}$, $\mathbf{x}_1$, $\mathbf{x}_2$, and $\mathbf{z}$ are taken over all elements in
the spaces $(\mathcal{Y})^N$, $(\mathcal{X}_1)^N$, $(\mathcal{X}_2)^N$, and $(\mathcal{Z})^L$ respectively.

We note from (1b) that $(M_1 - 1) < e^{N R_1}$. Now use the product
form of (70) and write the right-hand side of (76) as an exponential of
a logarithm. We find the desired result

$$ P_1 \le \exp\{ -N [E_1(\rho_1, P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2}) - \rho_1 R_1] \}, \tag{77} $$

where $E_1$ is given by (29a). The sums there are over all $\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{y}$, and $z$
contained respectively in $(\mathcal{X}_1)^K$, $(\mathcal{X}_2)^K$, $(\mathcal{Y})^K$, and $\mathcal{Z}$. Reversing the role
of $U_1$ and $U_2$ one immediately obtains

$$ P_2 \le \exp\{ -N [E_2(\rho_2, P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2}) - \rho_2 R_2] \}, \tag{78} $$

where $E_2$ is given by (29b).

The procedure for obtaining the upper bound for P 3 is very similar 
to that used for Pi. An outline of the proof follows. Fix y, x«,-, x 2lit , and 
z f . Define (B as the event 

P#ftau(y|Xi Vl ZiaO ^ P^U(y|x 10 ,x 2l -.) 

for some j' ?* j and some k' 9* k. (79) 
It can then be shown that for any S3 ^ and ^ p 3 ^ 1 

Pr [ffl] ^ (Mi - l)w(Jfi - 1)4L £ P!Bf CxillO 

L II x 2 

Y p(.V.K) /_ I _ x / PJ/] xA(ylXi,X2) \ "]" , Rm 

X-rx 2 |z (,x 2 |z,)l ~7^ — — 1 . (80; 

\ Pir|kix,(7l*i«, x 2 ,fc)/ J 

Averaging over y, xi,-,-, x 2 ,t, and z„ and then setting s 3 = 1/(1 + P3), 
we obtain 

Pz ^ (M 1 - l)p«(M, - 1)'» L L P 2 L) (z) 

y x 

X[EE Pxrff ) (xi|z)Pxyff > (x 2 |z)(P«x 1 x,(y|xi, *))i/(h-w>j+«. (81) 

II X2 

Replace $(M_1 - 1)^{\rho_3} (M_2 - 1)^{\rho_3}$ by the upper bound $e^{N \rho_3 (R_1 + R_2)}$, use (70)
repeatedly, and write terms as exponentials of logarithms. One finds

$$ P_3 \le \exp\{ -N [E_3(\rho_3, P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2}) - \rho_3 (R_1 + R_2)] \}, \tag{82} $$

where $E_3$ is given by (29c).

One minor change is made in the procedure to compute the upper 
bound for P 4 . We fix only the values of y, Xiy, and x 2 ,a (but not of z,). 
Define D as the event 

Ffffi.<7 1 Xuv, X«'»0 £ Wfi.(y I xio, zk«) (83) 



1064 THE BELL SYSTEM TECHNICAL JOURNAL, SEPTEMBER 1973 

for some i' 9* i and any j' and k'. Then for any s 4 ^ 0, ^ p 4 ^ 1, 
Pr[2D] ^ (Mo - 1)w(Mi)m(M 2 )'« 

xfEEPff(, 1 ,)(^f ] )"]", m 

fEPfru x.) = £ W(z, xi, x 2 ) . (85) 



where 



Averaging over y, Xi.y, x ra and setting s 4 = 1/(1 + pO, we obtain 
Pa ^ (Mo - l)"(Mi)"(M 2 )" 

xl[IE FfiPC*, x 2 )(PYTx lXj (y |xi, x ,) l/(1+ ">] l+M . (86) 

y xi *a 

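From (1b) we see that $(M_0 - 1) < e^{N R_0}$. An upper bound for $M_1$
follows from (1b) as

$$ \begin{aligned}
M_1 < e^{N R_1} + 1 &= e^{N R_1} (1 + e^{-N R_1}) \\
 &= \exp\left\{ N \left[ R_1 + \frac{1}{N} \log(1 + e^{-N R_1}) \right] \right\} \\
 &\le \exp\left\{ N \left[ R_1 + \frac{e^{-N R_1}}{N} \right] \right\}.
\end{aligned} \tag{87} $$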



(88) 



Using a similar upper bound for M 2 , we have that 

f r e -N(.Ri+R2)~\'] 

(Mo- l)(Mi)(M 2 ) ^ expjJVl #o + fli + # 2 + ^ Jj 

From (88) and (70) we then obtain 

Pa ^ exp { -iV[£ 4 (p 4 , P&) - pa(Ro + Bi + B 3 )]}, (89) 
where £ 4 is given by (29d). Summing (77), (78), (82), and (89) results 
in (27) which was to be proved. 

APPENDIX B 

Proof of Theorem 3 

It can be easily verified that

$$ E_a(\rho_a, P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2}) \big|_{\rho_a = 0} = 0 \quad \text{for } a = 1, 2, 3, 4. \tag{90} $$

It can also be shown by a straightforward but tedious calculation that 



OEa 

dp a 



Pa = 



I/(Xi;Y|X 2 ,Z), 
I/(X 2 ;Y|Xi,Z), 
±I(X h X 2 ;Y\Z), 

j e -N(Ri+R2) 

-^ /(Xi, X 2 ; Y) jy 



a = 1 

a = 2 

a = 3 

a = 4 



(91) 
where the $I$'s are mutual informations among $K$-vectors as computed
under the joint distribution (32). Furthermore, from (29) it is seen that
$E_a$ is analytic in $\rho_a$ in the neighborhood of $\rho_a = 0$ and so can be
expanded in a Taylor series about this point, $a = 1, 2, 3, 4$:



E«(p a , PjRri 

^ IaPa + O a ( P 2 a ), a = 1, 2, 3 

(92) 



+ I K h N I p4 + 0i{pl) ' a = 4 - 

Here $I_a$ is the appropriate expression from (91) and $O$ is the
usual Bachmann-Landau order-of-magnitude symbol. Furthermore, if
$\mathbf{R} \in \mathcal{R}(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$, we have from (31) that

^ /„ - R a = 8 a > 0, a = 1, 2, 3 4. (93) 

Combining (92) and (93) with (30), we see that 



P e (C N ) =g £ exp {-iV Pa [5 a + O a (p2)/ Pa ]} 

a— 1 

f r />-iV(Ri+fla) 

+ exp <j -N Pi [t A ^ + 4 (p!)/p 4 

Now choose the integer L so large that 



(94) 



5< = 8a — 



IK 



is positive. Next, choose sufficiently small positive values of pi, p 2 , P3, p* 
so that 5 a + 0M)/p o > 0, a = 1, 2, 3, and 5 4 + OM)/pa > 0. The 
coefficient of N in each exponential of (94) is now negative, and we 
can increase N in multiples of K starting at N = KL until each term 
of (94) is less than e/4. Call this value of N, N = KL . Then (33) 
follows. Q.E.D. 

APPENDIX C 

Proof of Lemma 1 

By definition of P el (C K ) and H{Ux\Y, U , U 2 ), 

P.i(Ck) = £ E L Z PffiTMYd, J, K y), (95) 

y i i^j*(y) * 
and

h{Ux\ y, u , (/ 2 ) = nii PflBwwft i» k > y) 

y » j A 

xlog P^ P ,Jil^,y) - (96) 

By separating out the terms for which j = j*(y) in (96), one finds the 
identity 

T m H(Ux\Y, U 0) £/ 2 ) - P e i(C K ) log (Mx - 1) - h(P el (C K )) 

= Z E E E PflBwwtt h k, y) 

y i i^i*(y) *= 

vl P ^ Ck) 

Xl08 (M 1 -l)P%} VoUlY U\i,k,y) 

+ '£'£'£ Pffihuxii, f(y), k, y) 

y i k 

xlog PK|™»U*Cy>K*,J>' (97) 

Now use the fact that $\log x \le x - 1$ to obtain

r^EE E E I" f ' 1 ^ ,T(l 'n fc ' y) - ^.^ J» fc > f> 1 
+ E E E [(i - P.OPffl^T(i*l*, fc, y) 

y i k 

- PuMujih i*(y), fc, y)] 

= P el + (1 - P el ) -1=0. (98) 

Replacing Mi by Mi + 1 yields (47a). Equations (47b), (47c), and 
(47d) are proved in a similar way starting from the definitions 

Pe2(C K ) = ZZZ E Pffiwtfrft h K y), (99a) 

y ,- j k?*k*(j) 

P e z(C K ) = EL E P«W(», j, fc> y). 09b) 

y » (>,t)^(j*(y),fc*Cy)) 

and 

P e (C K ) = E E PKbuvfo J, K y) • (99c) 

y (i,/,A)*(»* ./>.*•) 

APPENDIX D 

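Proof of Lemma 2

For part (a), we write a complicated conditional mutual information
in two different ways:

$$ \begin{aligned}
I(\mathbf{X}_1, U_1; \mathbf{Y} \mid U_2, \mathbf{X}_2, U_0)
 &= I(\mathbf{X}_1; \mathbf{Y} \mid U_2, \mathbf{X}_2, U_0) + I(U_1; \mathbf{Y} \mid \mathbf{X}_1, U_2, \mathbf{X}_2, U_0) \\
 &= I(U_1; \mathbf{Y} \mid U_2, \mathbf{X}_2, U_0) + I(\mathbf{X}_1; \mathbf{Y} \mid U_1, U_2, \mathbf{X}_2, U_0).
\end{aligned} \tag{100} $$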




Now 

7(X i; Y|t/„, U 2 ,X 2 ) 

= E Jloe QXJftgjgggiffl ^°> ^*> ^ 2 > jjj XiQ 

I nTu^Yic/o, c/ 2 ,x 2 ) 

= * {log fffefj } = /(X i; Y| C7o, X 2 ), (101) 

where the equalities result from the special form of the joint distribu- 
tions as given by (39) and (40). For the next mutual information in 
(100), we have 

I(Ui;Y\Uo, U 2 ,X h X 2 ) 

_ ft J j ^Ypl7oC/it/>XiXa(Y| Uo, U\, U 2 , Xl, X 2 ) 



P¥?uovAx,(Y\U ,U 2 ,X h X 2 ) 

-*{* ifegftg >- - (102) 

The third mutual information in (100) can be written 
I(Ui;Y\Ui, U 2 ,X 2 ) 

— v J \ PtffuoUiuJiAY \ Uo, U\, Uz, X 2 ) 

I s nTU*(Y|c/ , u 2 ,x 2 ) 

- ^{ iog P ^s^y } - iiu « riu * *>• <io3) 

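Finally,

$$ I(\mathbf{X}_1; \mathbf{Y} \mid U_0, U_1, U_2, \mathbf{X}_2) \ge 0, \tag{104} $$

since all mutual informations are non-negative. Combining (100)-(104),
we obtain (48a), which completes the proof of part (a).
The proofs for parts (b), (c), and (d) follow in a similar manner.

The equations corresponding to (100) are:

Part (b),

$$ \begin{aligned}
I(U_2, \mathbf{X}_2; \mathbf{Y} \mid U_0, U_1, \mathbf{X}_1)
 &= I(\mathbf{X}_2; \mathbf{Y} \mid U_0, U_1, \mathbf{X}_1) + I(U_2; \mathbf{Y} \mid U_0, U_1, \mathbf{X}_1, \mathbf{X}_2) \\
 &= I(U_2; \mathbf{Y} \mid U_0, U_1, \mathbf{X}_1) + I(\mathbf{X}_2; \mathbf{Y} \mid U_0, U_1, U_2, \mathbf{X}_1);
\end{aligned} \tag{105} $$

Part (c),

$$ \begin{aligned}
I(U_1, U_2, \mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0)
 &= I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0) + I(U_1, U_2; \mathbf{Y} \mid U_0, \mathbf{X}_1, \mathbf{X}_2) \\
 &= I(U_1, U_2; \mathbf{Y} \mid U_0) + I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0, U_1, U_2);
\end{aligned} \tag{106} $$

Part (d),

$$ \begin{aligned}
I(U_0, U_1, U_2, \mathbf{X}_1, \mathbf{X}_2; \mathbf{Y})
 &= I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}) + I(U_0, U_1, U_2; \mathbf{Y} \mid \mathbf{X}_1, \mathbf{X}_2) \\
 &= I(U_0, U_1, U_2; \mathbf{Y}) + I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0, U_1, U_2).
\end{aligned} \tag{107} $$

Q.E.D.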

APPENDIX E 

Proof of Lemma 3

We use the identities:

(a), $I(U_1; \mathbf{Y} \mid U_0, U_2) = H(U_1 \mid U_0, U_2) - H(U_1 \mid U_0, U_2, \mathbf{Y})$; (108a)

(b), $I(U_2; \mathbf{Y} \mid U_0, U_1) = H(U_2 \mid U_0, U_1) - H(U_2 \mid U_0, U_1, \mathbf{Y})$; (108b)

(c), $I(U_1, U_2; \mathbf{Y} \mid U_0) = H(U_1, U_2 \mid U_0) - H(U_1, U_2 \mid U_0, \mathbf{Y})$; (108c)

(d), $I(U_0, U_1, U_2; \mathbf{Y}) = H(U_0, U_1, U_2) - H(U_0, U_1, U_2 \mid \mathbf{Y})$. (108d)

From the joint distributions of the random variables $U_0$, $U_1$, and $U_2$
given in (2), we have:

(a), $H(U_1 \mid U_0, U_2) = H(U_1) = K R_1'$; (109a)

(b), $H(U_2 \mid U_0, U_1) = H(U_2) = K R_2'$; (109b)

(c), $H(U_1, U_2 \mid U_0) = H(U_1, U_2) = H(U_1) + H(U_2) = K(R_1' + R_2')$; (109c)

(d), $H(U_0, U_1, U_2) = H(U_0) + H(U_1) + H(U_2) = K(R_0' + R_1' + R_2')$. (109d)

Combining the appropriate equations in (108), (109), (41), (47), and
(48), we have (49), which was to be proved.

APPENDIX F 

Proof of Theorem 5 

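If $\mathbf{R}$ is an interior point of $\mathcal{S}^c$, then there is a sphere, $\sigma$, of radius
$\eta(\mathbf{R}) > 0$, centered on $\mathbf{R}$ such that every point in $\sigma$ is also in $\mathcal{S}^c$. Thus
every point in $\mathcal{S}(Q^{(K)}_{U_0 \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$ must be distant more than $\eta(\mathbf{R})$ away from
$\mathbf{R}$, and this is true for every $K$ and every $Q^{(K)}_{U_0 \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}}$ in $\mathcal{Q}_K$. This in turn
implies that one of the inequalities

$$ \begin{aligned}
R_1 - \frac{1}{K} I(\mathbf{X}_1; \mathbf{Y} \mid U_0, \mathbf{X}_2) &> \eta(\mathbf{R}) \\
R_2 - \frac{1}{K} I(\mathbf{X}_2; \mathbf{Y} \mid U_0, \mathbf{X}_1) &> \eta(\mathbf{R}) \\
R_1 + R_2 - \frac{1}{K} I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y} \mid U_0) &> \eta(\mathbf{R}) \\
R_0 + R_1 + R_2 - \frac{1}{K} I(\mathbf{X}_1, \mathbf{X}_2; \mathbf{Y}) &> \eta(\mathbf{R})
\end{aligned} \tag{110} $$

must hold for every encoding $C_K(\mathbf{R})$ of the sort under consideration,
whenever $\mathbf{R}$ is interior to $\mathcal{S}^c$.

Now from (41), $R'_\alpha \ge R_\alpha$, $\alpha = 0, 1, 2$, so that one of (110) holds also
when the $R$'s are replaced by $R'$'s. From Lemma 3 we then find that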

%,Pea(C K (R)) + ^ h(P ea (C K (R))) > Ij(R) 

for at least one a, a = 1, 2, 3, 4, (111) 
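where we define

$$ P_{e4}(C_K(\mathbf{R})) = P_e(C_K(\mathbf{R})) \tag{112} $$

and for any rate vector $\mathbf{R}$ we define an associated 4-vector $\tilde{\mathbf{R}}$ by

$$ (\tilde{R}_1, \tilde{R}_2, \tilde{R}_3, \tilde{R}_4) = (R_1, R_2, R_1 + R_2, R_0 + R_1 + R_2). \tag{113} $$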

But from (87) and (88) we see that 



e -KR a 



^^. + -^~ ^ (A* + e~*«), (114) 



so that 



#P..(C*(R)) + Mg^W) 

^ (B a + e~^)P £a (C K (R)) 4- /t(P..(Cic(R))). (115) 
Combining (111) and (115) we find that 

(A a + e-Z°)P ea (C K (R)) + h(P ea (C K (R)) ^ tj(R) 

for at least one a, a = 1, 2, 3, 4. (116) 
Now 

$$ h(x) \le 2\sqrt{x}, \quad 0 \le x \le 1, \tag{117} $$

as can be seen by the following simple argument. From

$$ 0 < \tfrac{1}{2} [1 + (1 - Z)^2], $$

it follows that $2Z < 1 + Z + Z^2/2 \le e^Z$ for $Z \ge 0$. But for $Z < 0$ we
also clearly have $2Z < e^Z$, so that $2Z < e^Z$ for all $Z$. Substitute
$Z = \log \sqrt{(1 - t)/t}$ to obtain

$$ \log \sqrt{\frac{1 - t}{t}} < \frac{1}{2\sqrt{t}}, \quad 0 < t < 1, $$

or

$$ \int_{\epsilon}^{x} \log \frac{1 - t}{t} \, dt < \int_{\epsilon}^{x} \frac{dt}{\sqrt{t}}, \quad 0 < \epsilon < x < 1. $$

Perform the integration and take the limit as $\epsilon \to 0$. Equation (117)
results.
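
The bound on the binary entropy function is easy to check numerically. The sketch below is an illustration only: it assumes natural logarithms for $h$ (consistent with the exponentials $e^{NR}$ used for code sizes in this paper), and the grid resolution and the names `h` and `holds_on_grid` are arbitrary choices made for the example.

```python
from math import log, sqrt

def h(x):
    """Binary entropy in nats, with h(0) = h(1) = 0 by continuity."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log(x) - (1.0 - x) * log(1.0 - x)

# Check h(x) <= 2*sqrt(x) on a uniform grid over [0, 1].
grid = [i / 10000 for i in range(10001)]
holds_on_grid = all(h(x) <= 2.0 * sqrt(x) for x in grid)

# Largest value of h(x) / (2*sqrt(x)) on the grid, excluding x = 0.
worst_ratio = max(h(x) / (2.0 * sqrt(x)) for x in grid[1:])
```

A grid check of this kind is not a proof, but it shows how much slack the bound has away from the origin.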

Use (117) in (116) to find that (A a + e~*°)P ea + 2VP ea ^ r?(R) 
for at least one a,a = l,2,3,4. This implies that 



P .. (Clt( R)) 4 vi+ ^;;r ] -^r = mr) > °- 

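Since $P_e(C_K(\mathbf{R})) \ge \max_a [P_{ea}(C_K(\mathbf{R}))]$, we find finally that

$$ P_e(C_K(\mathbf{R})) \ge \delta(\mathbf{R}) > 0, \tag{118} $$

where $\delta(\mathbf{R}) = \min_a \delta_a(\mathbf{R})$ is independent of $K$ and the encoding
$C_K(\mathbf{R})$. Q.E.D.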

APPENDIX G 

Proof of Theorem 6 

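We first show that for every positive integer $n$ and every integer $r$
such that $0 \le r \le n$,

$$ \mathbf{R}_1 \in \mathcal{R},\ \mathbf{R}_2 \in \mathcal{R} \ \Rightarrow\ \mathbf{R}_3 \equiv \frac{r}{n} \mathbf{R}_1 + \frac{n - r}{n} \mathbf{R}_2 \in \mathcal{R}. \tag{119} $$

Since $\mathcal{C}$ is the closure of $\mathcal{R}$, and since the rationals are dense in the
reals, (119) implies that if $\mathbf{R}_1 \in \mathcal{C}$ and $\mathbf{R}_2 \in \mathcal{C}$, then for every $\lambda$,
$0 \le \lambda \le 1$, $\mathbf{R}_3 = \lambda \mathbf{R}_1 + (1 - \lambda) \mathbf{R}_2 \in \mathcal{C}$, which shows $\mathcal{C}$ to be convex.
To establish (119), we use the notion of time sharing to generate
new codings from old ones. Suppose we have two codings $C_N(\mathbf{R}_1)$ and
$C_N(\mathbf{R}_2)$, both of block length $N$ and with numbers of words $\mathbf{M}_1$ and $\mathbf{M}_2$
respectively, where as usual

$$ \mathbf{M}_a = (M_{0a}, M_{1a}, M_{2a}) = (\lceil e^{N R_{0a}} \rceil, \lceil e^{N R_{1a}} \rceil, \lceil e^{N R_{2a}} \rceil), \quad a = 1, 2. \tag{120} $$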

Denote by $P_{e1}$ and $P_{e2}$ the respective error probabilities achievable
with $C_N(\mathbf{R}_1)$ and $C_N(\mathbf{R}_2)$. Now consider the possible channel input
vectors that can be obtained by using $C_N(\mathbf{R}_1)$ $r$ times followed by
$(n - r)$ uses of $C_N(\mathbf{R}_2)$. The totality of these input vectors, each of
$nN$ components, can be thought of as the words of a new code of block
length $nN$. Denoting its word size parameter by $\mathbf{M}$, we have

$$ M_i = (M_{i1})^r (M_{i2})^{n - r}, \quad i = 0, 1, 2. \tag{121} $$

If we use the decoders for $C_N(\mathbf{R}_1)$ and $C_N(\mathbf{R}_2)$ to decode the appropriate
blocks of length $N$ in this new larger code, the error probability for
the new code, $P_e$, will satisfy

$$ 1 - P_e = (1 - P_{e1})^r (1 - P_{e2})^{n - r} \ge (1 - r P_{e1}) [1 - (n - r) P_{e2}] \ge 1 - r P_{e1} - (n - r) P_{e2}, $$

so that

$$ P_e \le r P_{e1} + (n - r) P_{e2}. \tag{122} $$

Here we have used the fact that the channel is memoryless.

We now use this time-sharing notion to establish (119). Suppose that
integers $n$ and $r$ are given with $n > 0$, $0 \le r \le n$, and that $\mathbf{R}_1$ and $\mathbf{R}_2$
are rate points in $\mathcal{R}$. Suppose further that $\epsilon > 0$ is given. Then, from
Theorem 4, there exist positive integers $K_1$ and $L_1$ and a sequence of
codings $C_{N_1}(\mathbf{R}_1)$, $N_1 = K_1 L_1, K_1(L_1 + 1), K_1(L_1 + 2), \ldots$ such that
for each coding of the sequence $P_e(C_{N_1}(\mathbf{R}_1)) < \epsilon/n$. Similarly there
exist integers $K_2$ and $L_2$ and a second sequence of codings $C_{N_2}(\mathbf{R}_2)$,
$N_2 = K_2 L_2, K_2(L_2 + 1), K_2(L_2 + 2), \ldots$ such that for each coding in
the sequence $P_e(C_{N_2}(\mathbf{R}_2)) < \epsilon/n$. We now choose one coding out of
each of these sequences of codings in such a way that they are of the
same block length $N$. A suitable choice for $N$ is the least common
multiple of $K_1 L_1$ and $K_2 L_2$. Call the two codings $C_N(\mathbf{R}_1)$ and $C_N(\mathbf{R}_2)$.
Their error probabilities are $P_{e1} < \epsilon/n$ and $P_{e2} < \epsilon/n$. Time sharing
them as discussed earlier yields a new coding $C$, of block length
$N_3 = nN$, code book size $\mathbf{M}$ given by (121) and (120), and error
probability

$$ P_e \le r P_{e1} + (n - r) P_{e2} < r \frac{\epsilon}{n} + (n - r) \frac{\epsilon}{n} = \epsilon $$

from (122). Now from the fact that $\lceil x \rceil \lceil y \rceil \ge \lceil xy \rceil$, (121) and (120)
give

$$ M_i \ge \lceil e^{N_3 R_{i3}} \rceil, \quad i = 0, 1, 2, $$

where $\mathbf{R}_3$ is as in (119). Thus by deleting some words from the code $C$
we can obtain a coding with rate $\mathbf{R}_3$ and block length $N_3$ that has
error probability $P_e < \epsilon$. Q.E.D.
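
The bookkeeping in the time-sharing argument can be traced with concrete numbers. The sketch below is an illustration under assumed values: the function name `time_share`, the sample rates, block length, and error probabilities are all hypothetical, and the "exact" error probability assumes, as in the proof, that the $n$ sub-blocks are decoded independently over a memoryless channel.

```python
from math import ceil, exp

def time_share(n, r, N, rates1, rates2, pe1, pe2):
    """Combine two block-length-N codings: r uses of the first,
    (n - r) uses of the second. Rates are in nats per channel use."""
    M1 = [ceil(exp(N * R)) for R in rates1]            # code sizes, first coding
    M2 = [ceil(exp(N * R)) for R in rates2]            # code sizes, second coding
    M = [m1**r * m2**(n - r) for m1, m2 in zip(M1, M2)]  # sizes of combined code
    rates3 = [(r * a + (n - r) * b) / n for a, b in zip(rates1, rates2)]
    # Code sizes needed for rate vector rates3 at block length n*N.
    needed = [ceil(exp(n * N * R3)) for R3 in rates3]
    pe_exact = 1.0 - (1.0 - pe1)**r * (1.0 - pe2)**(n - r)
    pe_bound = r * pe1 + (n - r) * pe2                 # union-type bound
    return M, needed, rates3, pe_exact, pe_bound

# Assumed example values.
M, needed, rates3, pe_exact, pe_bound = time_share(
    n=4, r=1, N=5,
    rates1=(0.1, 0.3, 0.2), rates2=(0.2, 0.1, 0.4),
    pe1=0.01, pe2=0.02)
```

In this example every entry of `M` is at least the corresponding entry of `needed`, so words can be deleted to reach the averaged rate, and `pe_exact` does not exceed `pe_bound`.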

APPENDIX H 
Proof of Lemma 4 

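Consider (57a). We write

$$ \begin{aligned}
I(\mathbf{X}_1; \mathbf{Y} \mid Z, \mathbf{X}_2)
 &= E \log \frac{P^{(K)}_{\mathbf{Y}|Z \mathbf{X}_1 \mathbf{X}_2}(\mathbf{Y} \mid Z, \mathbf{X}_1, \mathbf{X}_2)}{P^{(K)}_{\mathbf{Y}|Z \mathbf{X}_2}(\mathbf{Y} \mid Z, \mathbf{X}_2)} \\
 &= E \log \frac{\prod_{t=1}^{K} P_{Y|X_1 X_2}(Y_t \mid X_{1t}, X_{2t})}{\prod_{t=1}^{K} P_{Y_t | Z \mathbf{X}_2 Y_1 \cdots Y_{t-1}}(Y_t \mid Z, \mathbf{X}_2, Y_1, \ldots, Y_{t-1})} \\
 &= \sum_{t=1}^{K} [H(Y_t \mid Z, \mathbf{X}_2, Y_1, \ldots, Y_{t-1}) - H(Y_t \mid X_{1t}, X_{2t})].
\end{aligned} \tag{123} $$

Here, for $t = 1$, the conditioning on $Y_1, \ldots, Y_{t-1}$ is to be omitted. But

$$ H(Y_t \mid Z, \mathbf{X}_2, Y_1, \ldots, Y_{t-1}) \le H(Y_t \mid Z, X_{2t}), \tag{124} $$

since removing conditioning random variables cannot decrease an
entropy. Combining (123) and (124) we have

$$ I(\mathbf{X}_1; \mathbf{Y} \mid Z, \mathbf{X}_2) \le \sum_{t=1}^{K} [H(Y_t \mid Z, X_{2t}) - H(Y_t \mid X_{1t}, X_{2t})] \tag{125} $$

or

$$ I(\mathbf{X}_1; \mathbf{Y} \mid Z, \mathbf{X}_2) \le \sum_{t=1}^{K} I(X_{1t}; Y_t \mid Z, X_{2t}). \tag{126} $$

The proofs for (57b), (57c), and (57d) are similar.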

APPENDIX I 

Let numbers $A_t$, $B_t$, $C_t$, and $D_t$ be given that satisfy the inequalities

$$ 0 \le A_t \le C_t, \tag{127a} $$
$$ 0 \le B_t \le C_t, \tag{127b} $$
$$ 0 \le C_t \le A_t + B_t, \tag{127c} $$
$$ 0 \le C_t \le D_t, \quad t = 1, 2, \ldots, K. \tag{127d} $$

Let $\mathcal{R}_t$ denote the set of points $(x, y, z)$ in three-space such that

$$ 0 \le x \le A_t, \tag{128a} $$
$$ 0 \le y \le B_t, \tag{128b} $$
$$ 0 \le x + y \le C_t, \tag{128c} $$
$$ 0 \le x + y + z \le D_t, \tag{128d} $$

for $t = 1, 2, \ldots, K$. A sketch of $\mathcal{R}_t$ is shown in Fig. 7, corresponding
to the case in which all the inequalities in (127) are strict. We further
define

$$ \mathcal{R} = \bigcup_{t=1}^{K} \mathcal{R}_t. \tag{129} $$

Now consider the region $\mathcal{R}_0$ consisting of all points $(x, y, z)$ such that

$$ 0 \le x \le A_0 \equiv \frac{1}{K} \sum_{t=1}^{K} A_t, \tag{130a} $$
$$ 0 \le y \le B_0 \equiv \frac{1}{K} \sum_{t=1}^{K} B_t, \tag{130b} $$
$$ 0 \le x + y \le C_0 \equiv \frac{1}{K} \sum_{t=1}^{K} C_t, \tag{130c} $$
$$ 0 \le x + y + z \le D_0 \equiv \frac{1}{K} \sum_{t=1}^{K} D_t. \tag{130d} $$

Fig. 7. The convex region $\mathcal{R}_t$.

Our first goal in this appendix is to show that

$$ \mathcal{R}_0 \subseteq \text{convex hull } \mathcal{R}. \tag{131} $$

By summing the inequalities (127) and using the definitions of $A_0$,
$B_0$, $C_0$, and $D_0$ given in (130), we see that (127) also holds for $t = 0$.
$\mathcal{R}_0$, too, then has the form shown in Fig. 7. As is seen, each region
$\mathcal{R}_t$, $t = 0, 1, \ldots, K$, is convex and has ten extreme points, the
components of which are listed below:

$$ \begin{aligned}
\mathbf{r}_{0t} &= (0, 0, 0) \\
\mathbf{r}_{1t} &= (A_t, 0, 0) \\
\mathbf{r}_{2t} &= (A_t, C_t - A_t, 0) \\
\mathbf{r}_{3t} &= (C_t - B_t, B_t, 0) \\
\mathbf{r}_{4t} &= (0, B_t, 0) \\
\mathbf{r}_{5t} &= (A_t, 0, D_t - A_t) \\
\mathbf{r}_{6t} &= (A_t, C_t - A_t, D_t - C_t) \\
\mathbf{r}_{7t} &= (C_t - B_t, B_t, D_t - C_t) \\
\mathbf{r}_{8t} &= (0, B_t, D_t - B_t) \\
\mathbf{r}_{9t} &= (0, 0, D_t).
\end{aligned} \tag{132} $$

[Some of these points may coincide if there are equalities in (127)
instead of strict inequalities.] For the extreme points of $\mathcal{R}_0$ we also have

$$ \mathbf{r}_{i0} = \frac{1}{K} \sum_{t=1}^{K} \mathbf{r}_{it}, \quad i = 0, 1, \ldots, 9, \tag{133} $$

which follows directly from (132) and the definitions on the right of
(130). We recall that a convex body is characterized by its extreme
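points: $\mathbf{r} \in \mathcal{R}_t$ if and only if

$$ \mathbf{r} = \sum_{i=0}^{9} \lambda_i \mathbf{r}_{it}, \tag{134} $$

where

$$ \lambda_i \ge 0, \quad i = 0, 1, \ldots, 9, \quad \text{and} \quad \sum_{i=0}^{9} \lambda_i = 1, \quad t = 0, 1, \ldots, K. \tag{135} $$

Equation (131) is now easy to establish. It is clear that the convex
hull of $\mathcal{R}$ is the set of all points that can be written in the form

$$ \mathbf{r}' = \sum_{i=0}^{9} \sum_{t=1}^{K} \mu_{it} \mathbf{r}_{it}, \tag{136} $$

where

$$ \mu_{it} \ge 0, \quad i = 0, 1, \ldots, 9, \quad t = 1, \ldots, K, \quad \sum_{i=0}^{9} \sum_{t=1}^{K} \mu_{it} = 1. \tag{137} $$

Now let $\mathbf{r}$ be any element in $\mathcal{R}_0$. Then $\mathbf{r}$ can be written in the form
(134)-(135) with $t = 0$. Substituting from (133) yields

$$ \mathbf{r} = \sum_{i=0}^{9} \sum_{t=1}^{K} \frac{\lambda_i}{K} \mathbf{r}_{it}. \tag{138} $$

But defining

$$ \mu_{it} = \frac{\lambda_i}{K}, \quad i = 0, \ldots, 9, \quad t = 1, 2, \ldots, K, \tag{139} $$

we see that $\mu_{it} \ge 0$ and

$$ \sum_{i=0}^{9} \sum_{t=1}^{K} \mu_{it} = 1. \tag{140} $$

Comparison with (136) now shows that $\mathbf{r}$ is in the convex hull of $\mathcal{R}$.
Equation (131) then follows.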
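
The averaging step behind (133) can be checked on random data. The following sketch is illustrative only: the helper names `extreme_points` and `in_region`, the random parameter ranges, and the seed are assumptions, and `in_region` additionally requires $z \ge 0$, which is taken here to be implied by the rate-space setting.

```python
import random

def extreme_points(A, B, C, D):
    """The ten extreme points listed in (132) for parameters A, B, C, D."""
    return [(0, 0, 0), (A, 0, 0), (A, C - A, 0), (C - B, B, 0), (0, B, 0),
            (A, 0, D - A), (A, C - A, D - C), (C - B, B, D - C),
            (0, B, D - B), (0, 0, D)]

def in_region(p, A, B, C, D, tol=1e-12):
    """Membership in the box-like region (128), with z >= 0 assumed."""
    x, y, z = p
    return (-tol <= x <= A + tol and -tol <= y <= B + tol and z >= -tol
            and x + y <= C + tol and x + y + z <= D + tol)

random.seed(0)
K = 5
params = []
for _ in range(K):
    A, B = random.random(), random.random()
    C = random.uniform(max(A, B), A + B)   # enforces (127a)-(127c)
    D = random.uniform(C, C + 1.0)         # enforces (127d)
    params.append((A, B, C, D))

# Averaged parameters A0, B0, C0, D0 as in (130).
avg = tuple(sum(p[i] for p in params) / K for i in range(4))

# Average of the i-th extreme points over t, to compare with (133).
pts_t = [extreme_points(*p) for p in params]
pts_avg = [tuple(sum(pts[i][c] for pts in pts_t) / K for c in range(3))
           for i in range(10)]
pts_0 = extreme_points(*avg)
```

Because each extreme point is linear in $(A_t, B_t, C_t, D_t)$, `pts_avg` and `pts_0` agree up to rounding, and each of these points lies in the averaged region.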

The application of the foregoing to (60) is immediate. Let

$$ \begin{aligned}
A_t &= I(X_{1t}; Y_t \mid X_{2t}, Z) \\
B_t &= I(X_{2t}; Y_t \mid X_{1t}, Z) \\
C_t &= I(X_{1t}, X_{2t}; Y_t \mid Z) \\
D_t &= I(X_{1t}, X_{2t}; Y_t),
\end{aligned} $$

$t = 1, \ldots, K$. Equations (127) are satisfied. We then identify $\mathcal{R}_t$ of
this appendix with $\mathcal{R}(P^{(1)}_{Z X_{1t} X_{2t} Y_t})$ of (60), $t = 1, 2, \ldots, K$, and $\mathcal{R}_0$ with
$\mathcal{R}^*(P^{(K)}_{Z \mathbf{X}_1 \mathbf{X}_2 \mathbf{Y}})$ of (59), which is consistent with (130). Then (129) and
(131) yield (60). Q.E.D.